
Artificial intelligence life cycle


Cindy Ng
Sr. Manager, Content
Starburst
SEMMA and CRISP-DM are both process models used in data mining and machine learning to guide the steps involved in developing predictive models and extracting useful insights from data.
They overlap in their core analytical steps but differ in origin, scope, flexibility, and adoption. Below is a comparison of SEMMA and CRISP-DM.
CRISP-DM: Developed in the late 1990s, CRISP-DM (Cross-Industry Standard Process for Data Mining) is a comprehensive and widely recognized framework for data mining projects. It was designed to guide the entire data mining process, from understanding business objectives to deploying models.
SEMMA: SEMMA (Sample, Explore, Modify, Model, Assess) was developed by SAS, a software company, as a framework for its data mining software. It focuses primarily on the modeling phase and is closely tied to SAS’s software suite, although it has also been used more broadly in data analysis and modeling.
CRISP-DM: CRISP-DM defines six distinct phases: business understanding, data understanding, data preparation, modeling, evaluation, and deployment.
Together these cover the entire data mining project lifecycle, from understanding business goals through data collection and preparation, model building, and evaluation to deployment.
SEMMA: SEMMA outlines five key phases: Sample, Explore, Modify, Model, and Assess.
These phases concentrate on the modeling work itself, offering guidance on data sampling, exploration, modification, modeling, and model assessment.
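To make the five SEMMA steps concrete, here is a minimal sketch in plain Python. It is not SAS's implementation or API. The function names, the toy `(x, y)` row format, the mean-imputation choice, and the 75/25 train/holdout split are all assumptions made for illustration.

```python
import random
import statistics


def sample(rows, fraction, seed=0):
    """Sample: draw a reproducible random subset of the rows."""
    rng = random.Random(seed)
    k = max(1, int(len(rows) * fraction))
    return rng.sample(rows, k)


def explore(rows):
    """Explore: summarize the input column to spot gaps."""
    xs = [x for x, _ in rows if x is not None]
    return {
        "n": len(rows),
        "missing_x": len(rows) - len(xs),
        "mean_x": statistics.mean(xs),
    }


def modify(rows, fill_value):
    """Modify: impute missing inputs with a chosen fill value."""
    return [(fill_value if x is None else x, y) for x, y in rows]


def model(rows):
    """Model: fit y = slope * x + intercept by ordinary least squares."""
    xs = [x for x, _ in rows]
    ys = [y for _, y in rows]
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    slope = sxy / sxx  # assumes the x values are not all identical
    return slope, mean_y - slope * mean_x


def assess(rows, slope, intercept):
    """Assess: mean absolute error of the fitted line on held-out rows."""
    errors = [abs(y - (slope * x + intercept)) for x, y in rows]
    return statistics.mean(errors)


def run_semma(rows, fraction=0.8, seed=0):
    """Chain the five steps; the 75/25 split is an illustrative choice."""
    subset = sample(rows, fraction, seed)
    summary = explore(subset)
    cleaned = modify(subset, fill_value=summary["mean_x"])
    split = int(len(cleaned) * 0.75)
    train, holdout = cleaned[:split], cleaned[split:]
    slope, intercept = model(train)
    return {
        "summary": summary,
        "slope": slope,
        "intercept": intercept,
        "mae": assess(holdout, slope, intercept),
    }
```

Calling `run_semma` on a list of `(x, y)` tuples returns the exploration summary, the fitted coefficients, and the holdout error. Nothing in the chain touches business goals or deployment, which is the part of a project that SEMMA leaves to other methodologies.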
CRISP-DM: CRISP-DM is considered a more flexible and comprehensive framework, suitable for a wide range of data mining and machine learning projects.
SEMMA: SEMMA is more specific to SAS software and is often used as a companion to other, more comprehensive methodologies like CRISP-DM.
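One way to see how SEMMA can sit inside a broader methodology is to line its steps up against the CRISP-DM phases. The sketch below encodes one rough alignment as a lookup table. The mapping is an approximation assumed for illustration, not something defined by either framework, and the names `Semma`, `CrispDm`, `SEMMA_TO_CRISP_DM`, and `uncovered_phases` are hypothetical.

```python
from enum import Enum


class CrispDm(Enum):
    BUSINESS_UNDERSTANDING = 1
    DATA_UNDERSTANDING = 2
    DATA_PREPARATION = 3
    MODELING = 4
    EVALUATION = 5
    DEPLOYMENT = 6


class Semma(Enum):
    SAMPLE = 1
    EXPLORE = 2
    MODIFY = 3
    MODEL = 4
    ASSESS = 5


# Assumed, approximate correspondence between the two process models.
SEMMA_TO_CRISP_DM = {
    Semma.SAMPLE: CrispDm.DATA_UNDERSTANDING,
    Semma.EXPLORE: CrispDm.DATA_UNDERSTANDING,
    Semma.MODIFY: CrispDm.DATA_PREPARATION,
    Semma.MODEL: CrispDm.MODELING,
    Semma.ASSESS: CrispDm.EVALUATION,
}


def uncovered_phases():
    """Return CRISP-DM phases with no SEMMA counterpart in this mapping."""
    covered = set(SEMMA_TO_CRISP_DM.values())
    return [phase for phase in CrispDm if phase not in covered]
```

Under this assumed mapping, `uncovered_phases()` returns the business understanding and deployment phases, which are the two ends of the lifecycle that a team using SEMMA would need to cover by other means.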
CRISP-DM is widely adopted and has extensive documentation and support from the data mining community. It is generally seen as a practical and effective methodology for data mining projects.
SEMMA, while useful for model-building within the SAS environment, may be less familiar and less widely adopted outside of the SAS user base.
CRISP-DM is a more comprehensive and widely accepted data mining process model that covers the entire project lifecycle.
SEMMA, on the other hand, is a more specialized framework that focuses primarily on the modeling phase and is closely associated with SAS software.
The choice between the two depends on the specific needs and tools of a given project, with CRISP-DM being the more general and flexible of the two.