Think Like a Data Scientist
How do data scientists utilize predictive and prescriptive analytics to create business value?
Key business initiatives should be:
Critical to immediate-term performance
Documented (communicated internally/publicly)
Cross-Functional (involving multiple business functions)
Championed by a senior business executive
Measurable against clear financial goals
Time-bound (has a well-defined delivery timeframe)
Advantageous (deliver financial or competitive advantage)
Develop Stakeholder Personas
Identify the key business stakeholders who either impact or are impacted by the targeted business initiative.
Identify Strategic Nouns
What are the key business entities that either impact or are impacted by the organization's key business initiative?
Capture Business Decisions
Document business stakeholder key decisions and write brief descriptions.
How much stuff do I need?
How many staff should be working?
How much of product X should I stock?
When is the best time to order more product?
Brainstorm Business Questions
This is perhaps the hardest part of the "thinking like a data scientist" exercise, which involves examining your strategic nouns from 3 perspectives:
Understanding what happened
How many widgets did I sell last month?
Predicting what will happen
How many widgets will I sell next month?
Recommending what to do next
How much of component Z should I order?
Leverage "By" Analysis
This is an exploratory technique of examining a strategic entity by its data attributes. This can uncover:
- Additional data sources
- Additional dimensional entity characteristics
- Additional areas for analytics exploration
"Show me Customer
- Remodel Date
- Day of Week
- Customer demo
Create Actionable Scores
Look for groupings of strategic noun dimensions and attributes that can be combined to create a more predictive and actionable score.
Put Analytics Into Action
Deliver analytics-driven scores and recommendations to the key business stakeholders.
DATA LAKE FOR BIG DATA STORAGE AND ANALYTICS
The traditional driving force behind a data lake strategy is the economics of Big Data storage and management in an environment of rapidly growing unstructured data. By consolidating data and eliminating expensive and inefficient storage silos, organizations can significantly reduce costs and streamline management. With a data lake, organizations can also provide more consistent levels of data protection and security to meet their specific governance and compliance requirements.
Beyond these core benefits, leading organizations recognize that the real power of the data lake is to enable their data science teams to quickly and easily apply powerful Big Data analytics that can unlock the value of Big Data assets, gain new insight and accelerate the success of the organization. With the 'in-place analytics' capabilities of a data lake, data scientists can initiate data analytics projects immediately and without the expense of investing in a separate analytics infrastructure or the time-consuming need to copy and move large data sets.
To realize the many advantages of a data lake, organizations need a Big Data storage infrastructure with multi-protocol capabilities, including native support for the Hadoop Distributed File System (HDFS), to enable the data lake to support a wide range of applications and workloads including Big Data analytics. Many organizations also need a flexible data lake infrastructure that can extend to enterprise edge locations including remote and branch offices as well as to the cloud.
Big Data Infrastructure
With Dell EMC, you can take your data lake strategy to the next level. A Dell EMC data lake allows you to store, manage, protect and analyze data while gaining breakthrough efficiency, scalability, and business agility, from edge-to-core-to-cloud. Key elements of a Dell EMC data lake solution are described below:
The industry-leading scale-out NAS platform, Isilon, is ideal for Big Data storage and analytics. Isilon is simple to manage and scales easily to 68 PB in a single cluster. With native multi-protocol support, including HDFS, Isilon supports Big Data analytics and a wide range of other applications and workloads on a single platform. With Isilon CloudPools software, you can seamlessly integrate your on-premise Isilon storage with a choice of public or private cloud storage providers. IsilonSD Edge software-defined storage allows you to easily integrate data from edge locations, such as remote and branch offices, into your core data center. In this way, Isilon enables you to extend your data lake from edge-to-core-to-cloud.
VCE Vblock Systems
VCE Vblock Systems simplify all aspects of IT and enable organizations to achieve better business outcomes faster with the world’s most advanced converged infrastructure. With flexible options like VCE technology extensions for Isilon, you can deploy a platform that advances development, QA and production lifecycles while modernizing and consolidating data center footprints. Harness the power of converged infrastructure to successfully deploy an enterprise data lake with built-in support for Hadoop and other Big Data analytics environments.
Elastic Cloud Storage (ECS)
Dell EMC Elastic Cloud Storage is a powerful hyperscale, geo-distributed object and HDFS storage platform for geo-scale analytics, with multi-cloud APIs to seamlessly connect to public clouds.
VMware Big Data Extensions is an extension of VMware vSphere that enables you to deploy, run, and manage a virtual Hadoop cluster. It provides a simple deployment toolkit, accessed through VMware vCenter Server, for deploying a highly available Hadoop cluster on VMware vSphere in minutes using the Big Data Extensions user interface.
Big Data Analytics
Pivotal Big Data Suite is an integration of Pivotal technologies with unlimited use of Pivotal HD to store all your data, accelerate processing, and increase the amount of data being analyzed and operationalized.
With a rich and compliant Structured Query Language (SQL) dialect, Pivotal HAWQ® supports application portability and a large ecosystem of data analysis and data visualization tools such as SAS, Tableau and more. Analytic applications written over HAWQ are easily portable to other SQL compliant data engines, and vice versa. This prevents vendor lock-in for the enterprise and fosters innovation, while containing business risk. Pivotal HAWQ provides strong support for low-latency analytic SQL queries, coupled with massively parallel machine learning capabilities.
Pivotal Big Data Suite can be deployed as part of PaaS technologies, on-premise and in public clouds, in virtualized environments, on commodity hardware or delivered as an appliance.
The Pivotal Big Data Suite portfolio is compatible with distributions of Open Data Platform (ODP) versions of Hadoop. All components are distributions of open source projects or are in the process of becoming open source projects.
Big Data Applications
Pivotal Cloud Foundry is an industry-leading, enterprise platform-as-a-service solution, powered by Cloud Foundry. It delivers an always-available, turnkey experience for scaling and updating applications on the private cloud.
Streamline application development, deployment and operation on a centrally-managed Platform-as-a-Service for public and private cloud. Streamline IT development with full visibility and control over your application lifecycle, provisioning, deployment, upgrades and security patches.
Accelerate time-to-value through automated deployment of analytic systems on virtualized infrastructure utilizing shared storage for immediate data access from all applications (i.e., no data copy operations to DAS). Dell EMC built an extensible platform that allows fast integration of new analytic applications and platform components, including ingest, indexing and data security applications. We support 3rd party and open source applications so your business can run analytics its own way.
Bill Schmarzo developed a maturity model to help businesses understand where they are with Big Data proficiency. Businesses can use this to identify the transformational changes they need to make in order to gain Big Data capabilities, operationalize them, and use them to drive new types of value for IT and the lines of business.
- Business Monitoring is how organizations begin with Big Data, by deploying business intelligence tools to monitor current business performance. This approach is about reporting on the past to know what happened, such as how many widgets I sold last month, or profit for the last quarter.
- Business Insights is the next stage of maturity, where organizations use analytics to drive insights that predict what will happen and integrate the insights into existing reports and dashboards, such as how many widgets I will sell next month, or projections for profits next quarter.
- Business Optimization is when organizations embed predictive and prescriptive analytics into existing business processes to optimize select business operations. This is the point where the analytics are providing guidance (tell me what I should do), such as telling me how many widgets to order to cover sales next month, or telling me to hire 4 new sales reps to cover expected seasonal demand.
- Data Monetization is reached when organizations create new revenue opportunities, such as 1) reselling data and analytics, 2) creating “intelligent” products, or 3) overhauling the customer engagement experience.
- Business Metamorphosis is achieved when organizations leverage customers' usage patterns, product performance behaviors, and market trends to create entirely new business models. Examples include Amazon transforming from an online bookstore to become the world's largest retailer, GE Aviation selling thrust instead of jet engines, John Deere selling farming optimization, and Florida Power & Light selling energy optimization.
Currently, many organizations find themselves within the first two stages, and they are generating business value in these stages. Our mission is to help organizations advance so they can uncover and execute on the highest-value business opportunities that will transform their businesses. We do it by starting with your strategic initiatives and business outcomes in mind.
STARTING POINT FOR BUSINESS LEADERS
Success starts with aligning IT and the business around a single strategic business initiative within a 9-12 month timeframe. This helps us identify an analytics use case that will accelerate a current business goal or solve a current problem. You need to deliver the right analytic recommendations to the data science teams – the workhorses of your Big Data ecosystem – to help them surface insights that can drive business value.
Big Data Vision Workshop
We have a unique methodology to identify and prioritize a single analytics use case with the best combination of implementation feasibility and business value. It's a 3-week engagement that applies research, interviews, data science expertise and techniques to your business, culminating in a 1-day workshop to identify and agree on the best analytics use case and path forward to solving a business problem. This approach sets us apart from the "bring in a bunch of technology and see what it can do" approach that's pushed by many vendors. We call this a Big Data Vision Workshop. The Big Data Vision Workshop from EMC Global Services aligns business and IT goals around Big Data, identifies strategic opportunities for Big Data analytics, prioritizes key use cases by assessing feasibility and business benefits, demonstrates the potential value using data science techniques, and recommends the appropriate analytics engagement and deployment roadmap.
STARTING POINT FOR IT LEADERS
Some organizations have already made progress implementing certain data and analytics use cases, and now the IT organization seeks to expand its capabilities and operationalize the processes to meet growing demands for better and faster data and analytics. But what often happens is that IT hits a technology wall, because the underlying infrastructure, tools and processes don't support the new demands of the business. Some typical scenarios we see are gaps in the Big Data capabilities within the IT environment, and long delays in delivery of incoming requests for data and analytics. Uniquely, Dell EMC helps you understand your technology gaps in the context of your business goals.
BIG DATA TECHNOLOGY ADVISORY SERVICE
Our Big Data Technology Advisory service helps your IT organization quickly understand its technology gaps with respect to your Big Data requirements and provides a roadmap and plan to integrate the capabilities you need. Although we often recommend a data lake with Hadoop as a foundational component of a Big Data architecture, we avoid recommending technology simply because a customer wants to "do" Big Data. Instead, we help you consider the technical capabilities required for your unique data sources and strategic business objectives so you can make the right recommendations about your future-state architecture. The target capabilities could include addressing data ingest challenges, ETL Offload, Data Discovery and Profiling, Rapid Environment Provisioning, or implementing components for Data-as-a-service. This process includes:
- Identify target capabilities for Big Data and analytics
- Assess current/desired state and obstacles
- Perform gap analysis of capabilities and technology
- Develop future state architecture
- Deliver technology roadmap and implementation plan
Once you know your technology gaps and future state architecture, our Proof of Technology Service lets you pilot the recommended architecture to validate that the hardware, software, and integration work with your existing environment and deliver the capabilities you need to meet the requirements of the business. The value to you is a validated architecture customized to your environment and needs. We also have implementation services to put your optimal architecture into production.
Reskill for Digital Transformation
Dell EMC offers a range of education services to help business leaders, aspiring Big Data practitioners, and seasoned data scientists increase their effectiveness with Big Data. We offer a 90-minute course for business leaders to develop a baseline understanding of data science and Big Data to help them identify opportunities and integrate Big Data into their business strategies.
For Big Data analytics practitioners and team leads, we have 1-day and 5-day courses that utilize industry specific examples to explore team development, data science concepts, analytic approaches, tools, and advanced methods and hands-on labs. We offer advanced-level 5-day courses for specific methods and tools with labs and Dell EMC Proven Data Science Certification.
Finally, we offer technology-focused training on the core elements of the Federation Business Data Lake, including the Isilon, Pivotal HD and ECS components.
Step 1: How do I identify my Key Business Initiatives?
Key business initiatives describe what the organization plans to achieve with its business strategy over the next 9-12 months; they usually include business objectives, financial targets, metrics and a timeframe.
A Business Initiative supports the business strategy and has the following characteristics:
- Critical to the immediate-term business and/or financial performance (usually 9 to 12 month timeframe)
- Documented (communicated either internally or publicly)
- Cross-functional (involves more than one business function)
- Owned or championed by a senior business executive
- Has a measurable financial goal
- Has a well-defined delivery timeframe
- Delivers compelling financial or competitive advantage
What's Important to Sports Shop?
Their Key Business Initiatives could be:
- Improve merchandising effectiveness
- Develop a compelling apparel assortment
- Make our stores and internet sites exciting places to shop and buy
- Increase the productivity of employees
Step 2: Develop Stakeholder Personas
We want to develop personas for each of the business stakeholders to better understand their work and job characteristics. Understanding this helps to capture the decisions and questions that these stakeholders must address with respect to the targeted business initiative.
A persona is a 1-2 page “day in the life” description that makes the key business stakeholder “come to life” for the data science and User Experience (UEX) development teams. Personas are useful in understanding the goals, tasks, key decisions, and pain points of the key business stakeholders. The persona helps the data science team to identify the most appropriate data sources and analytic techniques to support the decisions that the business users are trying to make and the questions that they are trying to answer. Personas are created for each type of business stakeholder affected by the given business initiative.
Stakeholder Persona for Sports Shop
Step 3: How To Identify Strategic Nouns
Strategic nouns are critical to data scientists' thinking process because these are the entities from which to gain new, actionable insights that ultimately help build analytic profiles.
Examples of strategic nouns include:
- Wind Turbines
Strategic Nouns for Sports Shop
For the "Improve Merchandising Effectiveness" business initiative, the strategic nouns could be:
Step 4: How Do I Capture Key Business Decisions?
What decisions do the business stakeholders need to make about the strategic nouns in support of the targeted business initiative? What data insight would support those decisions? These help to form the basis for generating an actionable analytics recommendation that can accelerate a targeted key business initiative.
Capturing and validating these decisions is critical to the "Thinking like a data scientist" process. Leading organizations like Uber and Netflix are disruptive because they build a business model that seeks to simplify their targeted customers' key "decisions." For Uber, one of the customer decisions that they address is "How do I easily get from Point A to Point B?" For Netflix, one of the customer decisions that they address is "What content (movie, TV show) can I easily watch tonight?"
Key Business Decisions for Sports Shop
We want to capture the decisions (where decision is defined as a conclusion or resolution reached after consideration) especially in light of the entity’s business initiatives. For our sports shop's "Improve Merchandising Effectiveness" Business Initiative, we are likely going to make decisions around product placement, special offers, and promotions.
Examples of Business Decisions for Sports Shop:
- Which products should be featured prominently?
- How should I bundle products to drive revenue per transaction?
- How many employees should be on the floor versus behind the registers?
- How can I sell more gift cards?
- How should I inform rewards card members of special offers?
- Are Buy 1 Get 1 Free offers more attractive to customers than 50% off?
- When should I start Back to School and Black Friday promotions?
- What is the right balance of men’s versus women’s items?
- What is the right balance of clothing versus sporting goods?
Step 5: How Do I Brainstorm Business Questions?
Brainstorm with each of the different stakeholders the decisions they need to make with respect to each strategic noun or key business entity in support of the targeted business initiative.
The Evolution of Analytic Questions
What Happened?
- How many widgets did I sell last month?
- What were sales by zip code for Christmas last year?
- How many of Product X were returned last month?
- What were company revenues and profits for the past quarter?
- How many employees did I hire last year?
What Will Happen?
- How many widgets will I sell next month?
- What will be sales by zip code over this Christmas season?
- How many of Product X will be returned next month?
- What are projected company revenues and profits for next quarter?
- How many employees will I need to hire next year?
What Should I Do?
- Order [5,000] units of component Z to support widget sales for next month
- Hire [Y] new sales reps by these zip codes to handle projected Christmas sales
- Set aside [$125K] in financial reserve to cover Product X returns
- Sell the following product mix to achieve quarterly revenue and margin goals
- Increase hiring pipeline by 35% to achieve hiring goals
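The progression from "What happened?" to "What will happen?" to "What should I do?" can be illustrated with a small sketch. The sales figures, the one-to-one ratio of component Z to widgets, and the naive trend forecast below are all assumptions made for illustration; a real data science team would likely choose a more robust forecasting method.

```python
from statistics import mean

# Hypothetical monthly widget sales, oldest first (illustrative numbers only).
monthly_sales = [4200, 4500, 4800, 5100]

# Assumption: each widget consumes one unit of component Z.
COMPONENT_Z_PER_WIDGET = 1


def what_happened(sales):
    """Descriptive: how many widgets did I sell last month?"""
    return sales[-1]


def what_will_happen(sales, window=3):
    """Predictive: naive forecast = last month plus the average recent change."""
    recent = sales[-window:]
    changes = [b - a for a, b in zip(recent, recent[1:])]
    return round(recent[-1] + mean(changes))


def what_should_i_do(sales, on_hand_component_z):
    """Prescriptive: how much of component Z should I order?"""
    needed = what_will_happen(sales) * COMPONENT_Z_PER_WIDGET
    return max(0, needed - on_hand_component_z)


print(what_happened(monthly_sales))          # descriptive answer
print(what_will_happen(monthly_sales))       # predictive answer
print(what_should_i_do(monthly_sales, 400))  # prescriptive answer
```

Each function builds on the one before it: the prescriptive answer depends on the prediction, which in turn depends on the historical record.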
Key Business Questions for Sports Shop
For their "Improve Merchandising Effectiveness" Business Initiative, we want to brainstorm questions about the "Customer" strategic noun, such as:
Descriptive Analytics (Understanding what happened)
- What customers are most receptive to what types of merchandising campaigns?
- What are the characteristics of customers (e.g., age, gender, customer tenure, life stage, favorite sports) who are most responsive to merchandising offers?
- Are there certain times of year where certain customers are more responsive?
Predictive Analytics (Predicting what will happen)
- Which customers are most likely to respond to a Back to School event?
- Which customers are most likely to respond to a BOGOF offer?
- Which customers are most likely to respond to a 50% off in-store markdown?
Prescriptive Analytics (Recommending what to do next)
- What personalized offers (recommendations) should I deliver to Anne Smith to get her to come into the store?
Step 6: What is "By" Analysis?
The “By” analysis technique exploits a business user’s natural “question and answer” enquiry process to identify new data sources, dimensional characteristics, variables and metrics that could be leveraged by the data science team in building the predictive and prescriptive analytic models to help predict business performance. The “By” analysis leverages a business stakeholder’s natural curiosity to brainstorm new:
- Metrics, measures and key performance indicators
- Dimensions (e.g., strategic nouns) and the attributes and characteristics associated with those dimensions or strategic nouns
- Areas for potential analytics exploration
The “By” analysis uses a simple “I want to [verb] [metric] by [dimensional attribute]” format to capture the business stakeholder brainstorming process and uncover new data and analytic requirements. The “By” analysis format looks like this:
“I want to”
- Verb such as [see, know, report, compare, trend, plot, predict, test score]
- Metric such as [sales, margin, profits, social media posts, comments, physician notes, vibration levels, sensor codes]
- Dimension or dimensional attribute such as [city, state, zip code, date, time, seasonality, product category, remodel date, store manager demographics]
Here is a “By” analysis example:
- I want to [report] [sales and product margin] by… [product category, store, store remodel date, day of week, store demographics, and customer demographics]
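As a rough sketch of how a "By" statement might translate into an actual query, the example below renders the statement and then totals a metric for each value of one dimensional attribute. The transaction records, field names, and both helper functions are hypothetical and exist only for illustration.

```python
from collections import defaultdict

# Hypothetical transaction records; field names are illustrative assumptions.
transactions = [
    {"product_category": "Footwear", "day_of_week": "Sat", "sales": 120.0},
    {"product_category": "Footwear", "day_of_week": "Sun", "sales": 80.0},
    {"product_category": "Apparel", "day_of_week": "Sat", "sales": 60.0},
    {"product_category": "Apparel", "day_of_week": "Sat", "sales": 40.0},
]


def by_statement(verb, metric, dimensions):
    """Render the 'I want to [verb] [metric] by [dimension]' format."""
    return f"I want to {verb} {metric} by {', '.join(dimensions)}"


def report_by(rows, metric, dimension):
    """Total a metric for each value of one dimensional attribute."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[dimension]] += row[metric]
    return dict(totals)


print(by_statement("report", "sales", ["product category", "day of week"]))
print(report_by(transactions, "sales", "product_category"))
print(report_by(transactions, "sales", "day_of_week"))
```

Every new "by" dimension that surfaces in brainstorming becomes another candidate grouping key, which is how the technique points toward additional data sources and attributes.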
"By" Analysis for Sports Shop
Here is an example of "By" analysis for hypothetical merchandising, using customer questions to improve merchandising effectiveness:
What customers are most receptive to what types of merchandising campaigns by:
- Marital Status
- Number of children
- Length of marriage
- Income level
- Education level
- Loyalty card member
- Own or rent residence
- Tenure in current home
- Value of current home
- Favorite sports
- Favorite teams
- High school sports
- College sports
- Weekend sports
- Active athlete?
- Exercise minutes/week
- Types of exercise
- Level of athletic effort
The significant number and variety of “By” dimensions and attributes that can surface in a brainstorming session can lead to incredible insight. And remember as you go through this process, all ideas are worthy of consideration; this is not the point to try to filter the creative ideas or handcuff the creative thinking process!
Step 7: What is the Score Technique?
The purpose of the “Score” technique is to look for groupings of strategic noun dimensions and attributes that can be combined to create a more predictive and actionable score. These scores are critical components of our “thinking like a data scientist” process because they support the decisions that we are trying to make, and/or the actions or outcomes we are trying to predict, with respect to our targeted business initiative. Scores are very important constructs in the world of data science, and can help to cement the business stakeholders’ buy-in to the data science process. The most familiar score example might be the FICO score, which combines multiple questions and dimensions about a loan applicant’s financial history to create a single score that lenders use to predict a borrower’s ability to repay a loan.
- Retirement Readiness
- Investment Risk
- Attrition Risk
- Fraud Risk
- Product Preferences
- Equipment Maintenance
- Supplier Reliability
- Supplier Quality
- Customer LTV
- Gaming Preferences
- Graduation Readiness
- Cohorts Influence
- Wellness Condition
- Stress Risk
- Energy Efficiency
- Conservation Effectiveness
- Fatigue Factor
- Motivation Factor
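One simple way to combine several attributes into a single score is a weighted sum of normalized values. The sketch below assumes a hypothetical customer responsiveness score; the attribute names, weights, and value ranges are invented for illustration and are not how FICO or any other real score is calculated.

```python
# Hypothetical weights; in practice these would be estimated from data.
WEIGHTS = {
    "loyalty_member": 0.40,
    "visits_last_90_days": 0.35,
    "offer_redemption_rate": 0.25,
}

# Assumed (min, max) ranges used to rescale each attribute to 0..1.
RANGES = {
    "loyalty_member": (0, 1),
    "visits_last_90_days": (0, 20),
    "offer_redemption_rate": (0.0, 1.0),
}


def normalize(value, low, high):
    """Clamp value into [low, high], then rescale to 0..1."""
    clamped = min(max(value, low), high)
    return (clamped - low) / (high - low)


def score(customer):
    """Weighted sum of normalized attributes, expressed on a 0-100 scale."""
    total = sum(
        weight * normalize(customer[name], *RANGES[name])
        for name, weight in WEIGHTS.items()
    )
    return round(100 * total)


# Hypothetical customer record.
customer = {
    "loyalty_member": 1,
    "visits_last_90_days": 10,
    "offer_redemption_rate": 0.5,
}
print(score(customer))
```

Keeping the weights and ranges in plain dictionaries makes it easy for business stakeholders to see, and challenge, which attributes drive the score.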
Scores for Sports Shop
Here are some examples of scoring opportunities for Sports Shop and variables that would contribute to them:
Step 8: Put Analytics Into Action
Facilitate the development of a compelling and actionable user experience by starting with a simple “Recommendations Worksheet.” The “Recommendations Worksheet” ties the decisions that our business stakeholders need to make (captured in Step 4) to the predictive analytics or scores that the data science team is going to need to build. The “Recommendations Worksheet” starts with the decisions captured in Step 4, and then identifies the potential recommendations that could be delivered to the business users (or consumers) in support of those decisions. Finally, the worksheet captures the potential scores (and the supporting variables and metrics) that can be used to power the recommendations.
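The worksheet structure described above (decision, then recommendations, then scores with supporting variables) could be captured in a small data structure such as the one sketched here. The class, the helper function, and the sample recommendations, scores, and variables are all hypothetical; only the two decisions are taken from the Step 4 examples.

```python
from dataclasses import dataclass, field


@dataclass
class WorksheetRow:
    decision: str  # a decision captured in Step 4
    recommendations: list = field(default_factory=list)
    scores: dict = field(default_factory=dict)  # score name -> supporting variables


def scores_to_build(worksheet):
    """Collect the distinct scores the data science team would need to build."""
    names = []
    for row in worksheet:
        for name in row.scores:
            if name not in names:
                names.append(name)
    return names


# Hypothetical worksheet rows; recommendations and scores are invented examples.
worksheet = [
    WorksheetRow(
        decision="Which products should be featured prominently?",
        recommendations=["Feature top-scoring products near the entrance"],
        scores={"Product Preferences": ["purchase history", "favorite sports"]},
    ),
    WorksheetRow(
        decision="How should I inform rewards card members of special offers?",
        recommendations=["Send offers through each member's preferred channel"],
        scores={
            "Product Preferences": ["purchase history", "favorite sports"],
            "Channel Responsiveness": ["email opens", "in-store redemptions"],
        },
    ),
]

print(scores_to_build(worksheet))
```

Listing the distinct scores across all rows shows where one score can serve several decisions, which helps prioritize what to build first.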
Analytics Into Action for Sports Shop
For our Sports Shop "Improve Merchandising Effectiveness" business initiative, the resulting Recommendations Worksheet could look like: