Identify the key business stakeholders who either impact or are impacted by the targeted business initiative.
What are the key business entities that either impact or are impacted by the organization's key business initiative?
This is perhaps the hardest part of the "thinking like a data scientist" exercise: examining your strategic nouns from three perspectives.
Understanding what happened: "How many widgets did I sell last month?"
Predicting what will happen: "How many widgets will I sell next month?"
Recommending what to do next: "How much of component Z should I order?"
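The three perspectives can be sketched with a toy example (all figures below are hypothetical): descriptive analysis reports last month's sales, a naive trend extension predicts next month's, and a simple coverage rule recommends how much of component Z to order.

```python
# A minimal sketch of the three question types, using made-up monthly
# widget sales figures (all numbers here are hypothetical).
monthly_sales = [120, 135, 150, 160]  # units sold, oldest to newest

# Understanding what happened: how many widgets did I sell last month?
last_month = monthly_sales[-1]

# Predicting what will happen: a naive forecast that extends the
# average month-over-month change (a real model would do far more).
avg_change = (monthly_sales[-1] - monthly_sales[0]) / (len(monthly_sales) - 1)
forecast = last_month + avg_change

# Recommending what to do next: order enough of component Z to cover
# the forecast, assuming (hypothetically) 2 units of Z per widget.
units_z_per_widget = 2
order_qty = round(forecast) * units_z_per_widget

print(last_month, round(forecast), order_qty)
```

The point is the progression, not the model: each question type builds on the previous one, ending in a concrete, actionable quantity.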
This is an exploratory technique that examines a strategic entity by its data attributes. It can uncover new data sources, dimensional characteristics, variables, and metrics.
Look for groupings of strategic noun dimensions and attributes that can be combined to create a more predictive and actionable score.
Deliver analytics-driven scores and recommendations to the key business stakeholders.
The Data Lake was born out of the "economics of big data," which allow organizations to store massive amounts of data at a cost that can be 20x to 50x cheaper than traditional data warehouse technologies. Due to the agile underlying Hadoop/HDFS architecture that typically supports the Data Lake, organizations can store structured data (relational tables, CSV files), semi-structured data (web logs, sensor logs, beacon feeds), and unstructured data (text files, social media posts, photos, images, video) as-is, without the time-consuming and agility-limiting need to pre-define a data schema on load.
However, the real power of the Data Lake is to enable the data science team to utilize advanced analytics against a growing variety of internal and external – structured and unstructured – data sources in an attempt to uncover new variables and metrics that are better predictors of performance.
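The "no schema on load" point can be illustrated with a minimal schema-on-read sketch, using only Python's standard library and made-up sample data: raw feeds land in the lake as-is, and structure is imposed only when the data is queried.

```python
import csv
import io
import json

# Hypothetical raw feeds landed in the lake as-is: a structured CSV
# extract and a semi-structured JSON-lines web log. No schema was
# defined on load; each feed is just bytes until someone reads it.
csv_blob = "order_id,amount\n1001,19.99\n1002,5.49\n"
log_blob = '{"user": "a7", "page": "/home"}\n{"user": "b3", "page": "/cart", "ref": "ad"}\n'

# Schema-on-read: structure is applied at query time.
orders = list(csv.DictReader(io.StringIO(csv_blob)))
total = sum(float(row["amount"]) for row in orders)

# Each log line may carry different fields; we read what is present.
pages = [json.loads(line).get("page") for line in log_blob.splitlines()]

print(round(total, 2), pages)
```

This is the agility the paragraph describes: new fields (like the `ref` attribute above) can appear in a feed at any time without a schema migration blocking the load.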
The EMC approach accommodates open technologies at every stage, but EMC, VCE, VMware, and Pivotal products can help you get a big data analytics solution up and running more quickly, with the additional functionality required for an enterprise environment.
EMC Isilon is a scale-out NAS storage platform with native multi-protocol support, including Hadoop, that eliminates inefficient storage silos, provides consistent security, and speeds time to insight.
Alternatively, VCE Vblock is a pre-integrated stack that combines server, shared storage, network devices, virtualization, and management to speed Hadoop deployments.
EMC Elastic Cloud Storage (ECS) Appliance is a powerful hyperscale, geo-distributed object and HDFS storage platform for geo-scale analytics, with multi-cloud APIs that seamlessly connect to public clouds.
And VMware Big Data Extensions, an extension of VMware vSphere, enables you to deploy, run, and manage virtual Hadoop clusters. Its deployment toolkit, accessed through VMware vCenter Server, can stand up a highly available Hadoop cluster in minutes.
Pivotal Big Data Suite integrates Pivotal technologies with unlimited use of Pivotal HD to store all your data, accelerate processing, and increase the amount of data being analyzed and operationalized. Pivotal HD is a commercially supported, enterprise-ready Hadoop distribution that ensures you can harness the massive data driven by new apps, systems, machines, and the torrent of customer sources.
With a rich and compliant Structured Query Language (SQL) dialect, Pivotal HAWQ® supports application portability and a large ecosystem of data analysis and data visualization tools such as SAS, Tableau and more. Analytic applications written over HAWQ are easily portable to other SQL compliant data engines, and vice versa. This prevents vendor lock-in for the enterprise and fosters innovation, while containing business risk. Pivotal HAWQ provides strong support for low-latency analytic SQL queries, coupled with massively parallel machine learning capabilities.
Pivotal Big Data Suite can be deployed as part of PaaS technologies, on-premises or in public clouds, in virtualized environments, on commodity hardware, or delivered as an appliance.
The Pivotal Big Data Suite portfolio is compatible with Open Data Platform (ODP) distributions of Hadoop. All components are distributions of open source projects or are in the process of becoming open source projects.
Pivotal Cloud Foundry is an industry-leading, enterprise platform-as-a-service solution, powered by Cloud Foundry. It delivers an always-available, turnkey experience for scaling and updating applications on the private cloud.
Streamline application development, deployment, and operation on a centrally managed Platform-as-a-Service for public and private cloud, with full visibility and control over your application lifecycle, provisioning, deployment, upgrades, and security patches.
Accelerate time-to-value through automated deployment of analytic systems on virtualized infrastructure, utilizing shared storage for immediate data access from all applications (i.e., no data-copy operations to DAS). EMC built an extensible platform that allows fast integration of new analytic applications and platform components, from ingest and indexing to data security applications. We support third-party and open source applications so your business can run analytics its own way.
Bill Schmarzo developed a maturity model to help businesses understand where they are with big data proficiency. Businesses can use this to identify the transformational changes they need to make in order to gain big data capabilities, operationalize them, and use them to drive new types of value for IT and the lines of business.
Many organizations today find themselves within the first two phases.
In the first phase, Business Monitoring, an organization deploys business intelligence to monitor current business performance. This is often a “rear view mirror” approach of reporting on the past.
In phase 2, Business Insights, organizations leverage predictive analytics to uncover actionable insights that can be integrated into existing reports and dashboards.
Phase 3, Business Optimization, is where organizations embed predictive analytics into existing business processes to optimize select business operations. This is a pivot point where the mirror begins to look toward the future and starts to drive business opportunities.
Phase 4, Data Monetization, is reached when organizations create new revenue opportunities, such as 1) reselling data and analytics, 2) creating "intelligent" products, or 3) overhauling the customer engagement experience.
Phase 5, Business Metamorphosis, is achieved when organizations leverage customers' usage patterns, product performance behaviors, and market trends to create entirely new business models.
The Big Data Vision Workshop from EMC Global Services seeks to align business and IT goals around big data, identify strategic opportunities for big data analytics, prioritize key use cases by assessing feasibility and business benefits, demonstrate the potential value using data science techniques, and recommend the appropriate analytics engagement and deployment roadmap.
For those who need to understand how their big data analytic use case will generate insight that turns into value, the Proof of Value service demonstrates the value of analytics and data science. The project will source and prepare the data relevant for a chosen use case, perform the statistical analysis, and then share final findings and analytical models. The Service generates a ‘minimum viable product’ app or process to demonstrate how business value can be created using the models and determine the ROI. EMC will recommend any necessary changes to people and process, document a business justification and provide a roadmap for implementation into a production environment.
EMC services professionals will stand up a data lake architecture so it's ready to execute on the target use case.
The configuration will be customized to the customer environment and the particular data requirements for the analytics use case. EMC Global Services automates the data ingest and processing, develops appropriate data governance and security controls, and builds the analytics application into a business process. The data lake platform will be ready to execute on countless future use cases such as: enhancing customer experience, improving marketing effectiveness, streamlining operations, or developing new products.
EMC offers a range of education services to help business leaders, aspiring big data practitioners, and seasoned data scientists increase their effectiveness with big data. We offer a 90-minute course for business leaders to develop a baseline understanding of data science and big data to help them identify opportunities and integrate big data into their business strategies.
For big data analytics practitioners and team leads, we have 1-day and 5-day courses that use industry-specific examples to explore team development, data science concepts, analytic approaches, tools, and advanced methods, with hands-on labs. We offer advanced-level 5-day courses on specific methods and tools, with labs and EMC Proven Data Science Certification.
Finally, we offer technology-focused training on the core elements of the Federation Business Data Lake, including the Isilon, Pivotal HD, and ECS components.
Key business initiatives define what the organization plans to achieve with its business strategy over the next 9-12 months; they usually include business objectives, financial targets, metrics, and a timeframe.
A Business Initiative supports the business strategy and has the following characteristics:
Their Key Business Initiatives could be:
We want to develop personas for each of the business stakeholders to better understand their work and job characteristics. This understanding helps capture the decisions and questions that these stakeholders must address with respect to the targeted business initiative.
A persona is a 1-2 page "day in the life" description that makes the key business stakeholder "come to life" for the data science and User Experience (UEX) development teams. Personas are useful in understanding the goals, tasks, key decisions, and pain points of the key business stakeholders. The persona helps the data science team identify the most appropriate data sources and analytic techniques to support the decisions that the business users are trying to make and the questions that they are trying to answer. Personas are created for each type of business stakeholder affected by the given business initiative.
Strategic nouns are critical to a data scientist's thinking process because they are the entities from which to gain new, actionable insights that ultimately help build analytic profiles.
Examples of strategic nouns include:
For the "Improve Merchandising Effectiveness" business initiative, the strategic nouns could be:
Brainstorm with each of the different stakeholders the decisions they need to make with respect to each strategic noun or key business entity in support of the targeted business initiative.
For their "Improve Merchandising Effectiveness" business initiative, we want to brainstorm questions for the "Customer" strategic noun:
The “By” analysis technique exploits a business user’s natural “question and answer” enquiry process to identify new data sources, dimensional characteristics, variables and metrics that could be leveraged by the data science team in building the predictive and prescriptive analytic models to help predict business performance. The “By” analysis leverages a business stakeholder’s natural curiosity to brainstorm new:
The "By" analysis uses a simple format to capture the business stakeholder brainstorming process and uncover new data and analytic requirements:

"I want to [verb] [metric] by [dimensional attribute]."
Here is an example of "By" analysis for a hypothetical merchandiser, using customer questions to improve merchandising effectiveness:
The significant number and variety of "By" dimensions and attributes that can surface in a brainstorming session can lead to incredible insight. And remember, as you go through this process, all ideas are worthy of consideration; this is not the point to filter ideas or handcuff the creative thinking process!
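As a minimal sketch, the "I want to [verb] [metric] by [dimensional attribute]" statements can be captured as structured data, with every brainstormed combination kept rather than filtered. The verbs, metrics, and dimensions below are hypothetical examples, not from an actual session.

```python
# Enumerate every brainstormed "By" statement from small, hypothetical
# lists of verbs, metrics, and dimensional attributes.
from itertools import product

verbs = ["compare", "track"]
metrics = ["sales", "returns"]
dimensions = ["store", "product category", "customer age group"]

# At this stage no idea is filtered out, mirroring the "all ideas are
# worthy of consideration" guidance above.
statements = [
    f"I want to {verb} {metric} by {dim}"
    for verb, metric, dim in product(verbs, metrics, dimensions)
]

print(len(statements))   # 2 verbs x 2 metrics x 3 dimensions = 12
print(statements[0])
```

Even these tiny lists produce a dozen candidate statements, which is why sessions surface so many dimensions and attributes for the data science team to evaluate later.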
The purpose of the "Score" technique is to look for groupings of strategic noun dimensions and attributes that can be combined to create a more predictive and actionable score. These scores are critical components of our "thinking like a data scientist" process, supporting the decisions that we are trying to make and the actions or outcomes we are trying to predict with respect to our targeted business initiative. Scores are very important constructs in the world of data science, and can help cement the business stakeholders' buy-in to the data science process. The most familiar example may be the FICO score, which combines multiple questions and dimensions about a loan applicant's financial history into a single score that lenders use to predict a borrower's ability to repay a loan.
Here are some examples of scoring opportunities for Sports Shop and variables that would contribute to them:
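A minimal sketch of how several variables might blend into a single composite score, in the spirit of the FICO example above. The variable names, weights, and normalization caps here are hypothetical illustrations, not an actual scoring model.

```python
# Hypothetical "customer loyalty" score for a sports shop: normalize
# each input to 0..1, then blend with illustrative weights.

def customer_loyalty_score(visits_per_month, avg_basket, months_active,
                           max_visits=20, max_basket=200.0, max_months=60):
    v = min(visits_per_month / max_visits, 1.0)   # visit frequency
    b = min(avg_basket / max_basket, 1.0)         # basket size
    m = min(months_active / max_months, 1.0)      # tenure
    weights = (0.5, 0.3, 0.2)                     # hypothetical weights
    return round(100 * (weights[0] * v + weights[1] * b + weights[2] * m))

# A frequent, long-tenured shopper scores higher than a new one.
print(customer_loyalty_score(10, 80.0, 36))
print(customer_loyalty_score(1, 20.0, 2))
```

The value of a score is exactly this collapse of several dimensions into one number that a business user can rank, threshold, and act on; the data science team's real work is finding which variables and weights are actually predictive.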
Facilitate the development of a compelling and actionable user experience by starting with a simple "Recommendations Worksheet." The "Recommendations Worksheet" ties the decisions that our business stakeholders need to make (captured in Step 4) to the predictive analytics or scores that the data science team is going to need to build. It starts with the decisions captured in Step 4, and then identifies the potential recommendations that could be delivered to the business users (or consumers) in support of those decisions. Finally, the worksheet captures the potential scores (and the supporting variables and metrics) that can be used to power the recommendations.
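The worksheet's shape can be sketched as a simple data structure: each row ties a stakeholder decision to candidate recommendations and the score(s) that would power them. The example row below is hypothetical, not an actual worksheet entry.

```python
# A minimal sketch of one Recommendations Worksheet row.
from dataclasses import dataclass, field

@dataclass
class WorksheetRow:
    decision: str                                # captured in Step 4
    recommendations: list = field(default_factory=list)  # candidate actions
    scores: list = field(default_factory=list)   # scores + supporting variables

row = WorksheetRow(
    decision="Which products should we feature next month?",
    recommendations=["Feature top products among high-loyalty customers"],
    scores=["Customer Loyalty Score (visits, basket size, tenure)"],
)

print(row.decision)
```

Keeping decision, recommendation, and score together in one row is the point of the worksheet: it makes visible exactly which analytic outputs each business decision depends on.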
For our Sports Shop "Improve Merchandising Effectiveness" business initiative, the resulting Recommendations Worksheet could look like: