Dashboards, performance reports, and meeting invitations pile up, and a decision must be made or a strategy developed. This typical workplace scenario can be overwhelming and can make people fixate on numbers. Unfortunately, organizations sometimes rely heavily on such figures to make decisions that prioritize the company’s interests over those of people and the planet. To keep a company’s approach to performance reporting and strategic decision-making effective and holistic, it is important to understand the different statistical factors that cause inaccuracy and inconsistency.
1. Weighted performance scores
Setting priorities and focusing on what matters by assigning each metric a weight proportional to its importance facilitates the performance reporting process and is integral to calculating an overall performance score. However, according to The KPI Institute, establishing weights can be subjective and misleading if not reported carefully: changing weights from one period to another makes the data inconsistent over time and creates a false impression of performance improvement or regression.
For example, a report presented to management could claim that the sales department’s performance score improved by 4% from 2021 to 2022. However, it is possible that the department’s performance did not change at all if all that was done was adjust the weights of the underperforming key performance indicators (KPIs) to produce different results.
To avoid such a misleading presentation, it is important to redirect the conversation to the metrics’ results and not the weighted score or the target. This will drive the decision-maker to concentrate on what matters and see the bigger picture.
Figure 1. Sales department’s performance in 2022 and 2021
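To illustrate, below is a minimal sketch with hypothetical KPI results and weights. Shifting weight away from an underperforming KPI raises the overall score even though none of the underlying results has changed.

```python
# Hypothetical KPI results (unchanged between years) and period weights.
kpi_results = {"new_leads": 0.70, "conversion_rate": 0.95, "retention": 0.85}

weights_2021 = {"new_leads": 0.50, "conversion_rate": 0.25, "retention": 0.25}
# In 2022, the underperforming KPI is quietly down-weighted.
weights_2022 = {"new_leads": 0.30, "conversion_rate": 0.40, "retention": 0.30}

def overall_score(results: dict, weights: dict) -> float:
    """Weighted overall performance score."""
    return sum(results[kpi] * weights[kpi] for kpi in results)

print(f"2021 score: {overall_score(kpi_results, weights_2021):.1%}")  # 80.0%
print(f"2022 score: {overall_score(kpi_results, weights_2022):.1%}")  # 84.5%
```

The apparent improvement comes entirely from the re-weighting, which is why the conversation should stay on the metrics’ results themselves.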
2. Truncated graphs
Arguably, the most common form of misrepresentation in graphs occurs when the Y-axis is manipulated to exaggerate the differences between bars. A truncated graph is produced when the axis starts at a value other than zero, which can give the illusion that the differences are large.
Even when the audience is informed that the Y-axis has been truncated, a study found that they still overestimate the actual differences, often substantially.
Start the axis at zero to show the true context of the data. This ensures that the data is presented naturally and accurately, reducing the chances of misinterpretation.
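As a minimal illustration, the matplotlib sketch below (scores are hypothetical) draws the same two bars twice: once on a truncated axis and once on an axis starting at zero.

```python
import matplotlib.pyplot as plt

years = ["2021", "2022"]
scores = [82, 86]  # hypothetical performance scores

fig, (truncated, honest) = plt.subplots(1, 2, figsize=(8, 3))

truncated.bar(years, scores)
truncated.set_ylim(80, 88)   # truncated axis exaggerates the gap
truncated.set_title("Truncated axis")

honest.bar(years, scores)
honest.set_ylim(bottom=0)    # zero baseline shows the true context
honest.set_title("Axis starting at zero")

plt.tight_layout()
plt.show()
```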
Figure 2. The differences between a regular graph and a truncated graph
3. Small sample size
Darrell Huff, the author of “How to Lie with Statistics,” believed that samples can be trusted only if they are statistically significant. Hence, to be worth much, a report or analysis based on sampling must use a representative sample, and for it to be representative, every source of bias must be removed.
For example, a skincare company might advertise its facial wash by printing “users report 85% less skin breakout” on its packaging. Upon closer inspection, one may discover that the test group consisted of only 15 people. This sample size works well for the skincare company because it is easier to repeat the experiment with small groups until the desired result appears, ignoring any findings that do not support how the company wants to promote its product. Sooner or later, a focus group will show a significant improvement worthy of a headline, and an advertising campaign will be built on a deceptive result.
There is no clear-cut answer to what the appropriate sample size is, since it depends heavily on the research type and the population size. Many statistical methods and equations can help determine it, such as Herbert Arkin’s formula or the Krejcie and Morgan table. In addition, when reporting survey or research results, it is important to be transparent with the audience and communicate the survey methodology, the population, and the logic behind the chosen sample size.
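As a sketch of what such a calculation looks like, the function below implements the Krejcie and Morgan (1970) formula, assuming the conventional defaults of a 95% confidence level (chi-square of 3.841), p = 0.5 for maximum variability, and a 5% margin of error.

```python
import math

def krejcie_morgan(population: int,
                   chi_sq: float = 3.841,
                   p: float = 0.5,
                   margin: float = 0.05) -> int:
    """Recommended sample size for a given population size."""
    numerator = chi_sq * population * p * (1 - p)
    denominator = margin ** 2 * (population - 1) + chi_sq * p * (1 - p)
    return math.ceil(numerator / denominator)

print(krejcie_morgan(1_000))  # 278, matching the published table
```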
4. Misleading averages
While averages can provide a simple summary of large datasets, they can also produce half-truths: if the range of the underlying numbers is too wide, the average becomes meaningless. An average should therefore be presented along with supporting facts that provide in-depth analysis and enable the right decision at the right time for the business. It is important not to rely on an average alone but also to look at the distribution of the data, understand the range, and consider other statistical measures such as the median and the mode. The median is often a better measure of central tendency than the mean because it is less affected by outliers.
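A short sketch with hypothetical salary figures shows how a single outlier can distort the mean while the median stays close to the typical value.

```python
from statistics import mean, median

salaries = [48_000, 50_000, 52_000, 54_000, 400_000]  # one executive outlier

print(f"Mean:   {mean(salaries):,.0f}")    # 120,800 -- pulled up by the outlier
print(f"Median: {median(salaries):,.0f}")  # 52,000  -- the typical value
```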
To conclude, it’s crucial for all stakeholders—whether they’re performance specialists or top-level executives—to understand how human biases can infiltrate numbers and form a narrative that is completely different from reality. This can be avoided by ensuring that the reported data covers all the quality aspects of completeness, uniqueness, conformity, consistency, and timeliness. In addition, organizations must establish a data validation process to ensure the credibility of performance reports.
**********
About the Guest Expert: Wedad Alsubaie, a seasoned Senior Strategy Management Officer at the National Unified Procurement Company (NUPCO), holds certifications in strategic business planning and KPI and performance management. With extensive experience in enhancing corporate and individual performance, she led the performance development program in Mobily and is now focused on corporate strategy and performance management at NUPCO.
Editor’s Note: This was originally published in Performance Magazine Issue No. 29, 2024 – Strategy Management Edition.
Isn’t it fascinating how charts can dissolve divides and turn disconnection into discovery? It’s a purposeful medium where even the fragments and imbalances of the world are part of the story. By unifying or restructuring grand and small narratives into a more distilled expression, charts make complex truths accessible to a wider audience. Consequently, audiences become not just spectators but also collaborators, drawing meaning from their own perspectives or contributing to a common understanding. However, producing charts with this level of effectiveness doesn’t happen through the mere union of analysis and aesthetics.
It starts with choosing the right chart for your data. That choice should be driven by intuitive wisdom and structured control, and both are anchored on purpose. By defining what you need to bring out of the data, you eliminate the unessential and get your message across as quickly and as clearly as possible.
What’s Your Purpose?
Comparison is a common purpose for data visualization, and the bar chart is the tool to make it happen. Bar charts use one categorical variable and one numerical variable, such as income data across departments. For more nuanced comparisons, clustered bar charts can introduce another categorical variable, like seniority levels within departments. Bar charts, as shown in Figure 1, have horizontal bars, while their counterpart, the column chart, uses vertical bars.
Figure 1. A bar chart (left) and a column chart (right) | Image Source: Financial Edge
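As a minimal matplotlib sketch with hypothetical department figures, barh() draws the horizontal bars of a bar chart while bar() draws the vertical bars of a column chart.

```python
import matplotlib.pyplot as plt

departments = ["Sales", "Marketing", "IT"]
income = [120, 90, 75]  # hypothetical income per department

fig, (bar_ax, col_ax) = plt.subplots(1, 2, figsize=(8, 3))
bar_ax.barh(departments, income)   # bar chart: horizontal bars
bar_ax.set_title("Bar chart")
col_ax.bar(departments, income)    # column chart: vertical bars
col_ax.set_title("Column chart")
plt.tight_layout()
plt.show()
```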
Another purpose for data visualization is to show composition, specifically the contribution of parts to a whole. This type of visualization generally involves one categorical variable and one numerical variable. Composition charts usually take the form of a pie chart or a donut chart (see Figure 2). An example of what they can demonstrate is the percentage of customers who prefer one product over another.
Figure 2. A pie chart (left) and a donut chart (right) | Image Source: Medium
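A minimal sketch with hypothetical product shares: a donut chart is simply a pie chart whose wedges are drawn with a width smaller than the radius.

```python
import matplotlib.pyplot as plt

labels = ["Product A", "Product B", "Product C"]
shares = [55, 30, 15]  # hypothetical customer preference shares

fig, (pie_ax, donut_ax) = plt.subplots(1, 2, figsize=(8, 4))
pie_ax.pie(shares, labels=labels, autopct="%1.0f%%")
pie_ax.set_title("Pie chart")
donut_ax.pie(shares, labels=labels, autopct="%1.0f%%",
             wedgeprops={"width": 0.4})  # hollow center makes it a donut
donut_ax.set_title("Donut chart")
plt.tight_layout()
plt.show()
```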
More complex data with multiple categorical variables will benefit more from stacked bar charts, which are similar to bar charts but represent an additional categorical variable as segments stacked on top of one another within each bar (see Figure 3). Alternatively, complex data can be illustrated using a treemap, which divides a rectangular area into smaller rectangles to represent hierarchical data, such as income distribution across regions and cities (see Figure 4).
Figure 3. A stacked bar chart | Image Source: Medium
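A minimal matplotlib sketch of a stacked bar chart with hypothetical figures: each additional segment is drawn with bottom= set to the totals of the segments beneath it.

```python
import matplotlib.pyplot as plt

regions = ["North", "South", "East"]
product_a = [40, 35, 30]  # hypothetical revenue per region
product_b = [25, 30, 20]

fig, ax = plt.subplots()
ax.bar(regions, product_a, label="Product A")
ax.bar(regions, product_b, bottom=product_a, label="Product B")  # stacked on top
ax.set_ylabel("Revenue")
ax.legend()
plt.show()
```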
If your purpose is to show change over time, this can be done with line charts or area charts (see Figure 5). The line chart works best for a numerical variable plotted over a time series, such as monthly revenue. The continuity of the line allows the viewer’s eye to easily notice trends and changes over time. The area chart goes one step further by shading the area under the line, emphasizing both the change and its magnitude. For more compact visualizations within tables, sparklines can be used. These are small charts that can be placed in a sheet’s individual cells and can highlight trends in big data sets or point out maximum and minimum values (see Figure 6).
Figure 5. A line chart (left) and an area chart (right) | Image Source: Edraw
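A minimal sketch of both forms for a hypothetical monthly revenue series; fill_between() shades the area under the line to emphasize magnitude.

```python
import matplotlib.pyplot as plt

months = list(range(1, 7))                # January to June as month numbers
revenue = [100, 115, 108, 130, 142, 138]  # hypothetical monthly revenue

fig, (line_ax, area_ax) = plt.subplots(1, 2, figsize=(8, 3))
line_ax.plot(months, revenue, marker="o")
line_ax.set_title("Line chart")

area_ax.plot(months, revenue)
area_ax.fill_between(months, revenue, alpha=0.3)  # shaded area shows magnitude
area_ax.set_title("Area chart")

plt.tight_layout()
plt.show()
```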
As for visualizing relationships between variables, go with a scatter plot or bubble chart for numerical data and a heat table for categorical data. A scatter plot displays data points to show the correlation of two numerical variables, while a bubble chart is similar to a scatter plot, with the x- and y-axes representing two numerical variables, except that the bubbles (or circles) vary in size to encode a third numerical variable (see Figure 7).
Figure 7. A scatter plot (left) | Image Source: Medium | and a bubble chart (right) | Image Source: Medium
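A minimal sketch with hypothetical data: in the bubble chart, the s= argument of scatter() encodes the third numerical variable as marker area.

```python
import matplotlib.pyplot as plt

ad_spend = [10, 20, 30, 40, 50]        # hypothetical numerical variables
revenue = [15, 28, 35, 48, 60]
customers = [100, 250, 400, 650, 900]  # third variable sized as bubbles

fig, (scatter_ax, bubble_ax) = plt.subplots(1, 2, figsize=(8, 3))
scatter_ax.scatter(ad_spend, revenue)
scatter_ax.set_title("Scatter plot")
bubble_ax.scatter(ad_spend, revenue, s=customers, alpha=0.5)
bubble_ax.set_title("Bubble chart")
plt.tight_layout()
plt.show()
```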
For categorical data, a heat table has one categorical variable placed in rows and another placed in columns. The cells of the table are then coded with numerical values, often by way of different intensities of color. This is a particularly useful way to identify patterns or relationships between categorical variables, such as countries and performance scores (see Figure 8).
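A minimal sketch of a heat table using imshow(), with hypothetical country-by-quarter scores; color intensity encodes the numeric value in each cell.

```python
import matplotlib.pyplot as plt

countries = ["UK", "France", "Spain"]
quarters = ["Q1", "Q2", "Q3", "Q4"]
scores = [[72, 80, 85, 90],   # hypothetical performance scores
          [65, 70, 78, 82],
          [88, 84, 79, 75]]

fig, ax = plt.subplots()
im = ax.imshow(scores, cmap="Blues")  # color intensity encodes the value
ax.set_xticks(range(len(quarters)))
ax.set_xticklabels(quarters)
ax.set_yticks(range(len(countries)))
ax.set_yticklabels(countries)
fig.colorbar(im, ax=ax, label="Performance score")
plt.show()
```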
When you’re working with geospatial data, you might find a choropleth map more suitable (see Figure 9). It plots a numeric variable, like population density, over a geospatial variable, such as regions or countries. This type of map reveals spatial patterns by shading particular regions in differing tones.
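A minimal sketch using Plotly Express and its bundled gapminder sample data, shading each country by population.

```python
import plotly.express as px

df = px.data.gapminder().query("year == 2007")
fig = px.choropleth(df, locations="iso_alpha", color="pop",
                    hover_name="country",
                    color_continuous_scale="Blues")
fig.show()
```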
The right chart for your data isn’t always immediately obvious. By establishing your purpose first, you narrow down your choices. You avoid overcomplicating or underrepresenting information. Another layer of factors to consider is the type of data you have and its size. And beneath all of these is your audience. From their familiarity with charts to the complexity of your data, decision-making always involves the people you are creating a chart for in the first place.
To learn more about data visualization, check out more articles here.
**********
Editor’s Note: This article was written in collaboration with Islam Salahuddin, data consultant at Systaems.
In the dynamic business landscape, strategic decision-making is the compass guiding organizations toward success. At the heart of this process lies data, the invaluable asset that fuels analytics and shapes the trajectory of strategic initiatives. However, the accuracy and reliability of data are the linchpins that determine the efficacy of these decisions. From this perspective, data governance plays a pivotal role in maintaining data integrity.
Data governance is a set of policies, processes, roles, and standards that ensure the effective and efficient use of data across an organization, in addition to compliance with relevant regulations and ethical principles. Data governance aims to establish clear rules and responsibilities for data creation, collection, storage, access, sharing, analysis, and reporting.
In that sense, a robust data governance strategy is indispensable in the context of strategic analytics. A data governance strategy is crucial for maintaining data accuracy and reliability and ensuring that the information driving decision-making processes is consistent, timely, and aligned with organizational goals. Key measures in such a strategy include:
assigning ownership of data assets and defining roles for data stewards, data analysts, and other stakeholders
implementing data quality KPIs and procedures to monitor and improve data accuracy and completeness (see the sketch after this list)
implementing robust data security measures to securely store and access data and to comply with data privacy regulations and ethics
investing in data management tools to automate data cleaning, profiling, and lineage tracking tasks
promoting a data-driven culture by educating employees on data governance policies and best practices
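As a minimal pandas sketch of one such data quality KPI, the snippet below measures completeness as the share of non-missing values per column; the file name and the 95% threshold are hypothetical.

```python
import pandas as pd

df = pd.read_csv("customer_records.csv")  # hypothetical data asset

completeness = df.notna().mean()  # fraction of non-null cells per column
below_target = completeness[completeness < 0.95]  # hypothetical 95% target

print("Completeness per column:")
print(completeness.round(3))
if not below_target.empty:
    print("Columns failing the completeness KPI:")
    print(below_target)
```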
There are several examples of how the aforementioned measures help companies with their strategy management. One of these case studies is Wells Fargo, one of the largest banks in the United States. Wells Fargo adopted a data governance operating model, which defines the roles, responsibilities, and processes for the effective management of data across the organization. The company was better able to do this by connecting its data sources with a data fabric that integrates inputs from multiple systems.
In another case study, GE Aviation, the aviation division of General Electric, consolidated its scattered data sources in a data lake. A data lake is a large-scale data storage and processing platform that can handle structured and unstructured data from various sources, making information more manageable, reliable, and accessible for all users within the organization.
The two examples show that strategy management improved when built on accurate and reliable data, leading to better outcomes. Simply put, if the accuracy of the inputs is jeopardized, then the outputs can only be expected to be flawed.
**********
Editor’s Note: This article was originally published in the print edition of Performance Magazine Issue No. 29, 2024 – Strategy Management Edition
Despite the continuous hype around data analytics and the rapid acceleration of data technologies such as machine learning (ML) and artificial intelligence (AI), most companies are lagging behind with low data capabilities and no in-house data team in place. These companies have their data either fully unleveraged or marginally analyzed by executives on the side of their jobs to produce limited reports.
In such a situation, pushing the organization up the hill of data maturity would require building a team of data-specialized personnel. Building such a team can be daunting, as every company would have different conditions and no one way can fit all cases. However, covering the following main grounds can help cut miles on the road to building a data team from the ground up.
First, nurture the environment and plant the seeds. Data teams cannot grow in a vacuum. To prepare the organization to become data-driven with a data team, enhancing the organizational data culture is a good starting point. Having employees at all levels with a data-driven mentality and an understanding of the role of data analytics can significantly pave the way for the planned team.
Second, connect with stakeholders and recognize priority needs. Carrying out data culture programs inside the organization can open up opportunities to have meaningful discussions with stakeholders on different levels about their data needs, what they already do with data, and what they want to achieve, in addition to having better insights into the pre-existing data assets. This is a good stage to recognize the organization’s data pain points, which would then be the immediate and strategic objectives of the future data team.
Third, define the initial structure of the team. According to the scale of the organization and the identified needs, data teams can have one of three main structures:
Centralized: This involves having all data roles within one team reporting to one head, such as a chief data officer (CDO) or a similar role. All departments in the organization would request their needs from the team. This is a straightforward approach, especially for small companies, but it can become a bottleneck if the team is not scaled up continuously to meet the organization’s growing needs.
Decentralized: This requires disseminating all data roles and infusing them into departmental teams. This mainly aims to close the gap between technical analysis and business benefits as analysts in every team would be experts in their functional areas. However, the approach may lead to inconsistencies in data management and fragile data governance.
Hybrid: This consists of having governance, infrastructure, and data engineering roles within a core team, along with embedding data analysts, business analysts, and data scientists in departmental teams. The allocated personnel would report to the respective department head as well as the data team head. This approach combines the benefits of both centralized and decentralized structures and is usually applicable in large organizations as they require more headcount in their data teams.
Fourth, map the necessary tech stack and data roles. As the previous stages have uncovered the current uses and needs of data in an organization, it should be easier to start figuring out the tech tools that the team would be initially working with. Mapping the needed tech stack would be the first pillar before moving on to the hiring process. The second pillar would involve defining the roles that the team would need in its nascent stage to meet the prioritized objectives.
Several data job titles can be combined in a data team, many of them with specializations that overlap or intersect with one another. However, there are three main role areas that should be considered for starting data teams:
Data engineering: implementing and managing data storage systems, integrating scattered datasets, and building pipelines to prepare data for analysis and reporting
Data analysis: performing final data preparation and extracting main insights to inform decision-making
Data science: building automated analysis and reporting systems, usually concerned with predictive and prescriptive machine learning models
Fifth, follow step-by-step team recruitment. Hiring new employees for the data team is one option. The other option can be upskilling existing employees with an interest in a data career and with minimum required skills. Even employees with just interest and no minimum required skills can be reskilled to fill some roles, especially within an initial data team.
The team does not need to take off with full wings. It can start small and gradually grow. Typically, data teams start with data analysts who have extra skills in data engineering, data engineers who have experience with ad-hoc analyses and reporting, or a limited combination of both. In later stages, other titles can come on board.
The baby-step-building approach is more convincing for stakeholders as it can be more efficient from a return-on-investment (ROI) perspective. Starting with a full-capacity team may end up being too costly for the organization, which could lead to the budding project being cut off in its prime.
Sixth, deliver ad-hoc analyses while heading toward long-term projects. In the beginning, the organization’s data analytics experts would be expected to answer ad-hoc requests and solve urgent data-related problems, like developing quick reports and on-the-spot metrics. This is a good point at which to prove how data personnel can be of direct benefit to the organization.
However, along with delivering said ad-hoc requests, the data team should have strategic goals to enhance and develop the overall data maturity of the organization, like organizing, integrating, and automating the analytics processes and installing advanced predictive models. These long-term projects should foster the organization’s data maturity, which should result in ad-hoc requests being less frequent as all executives should be self-sufficient in using the installed automated reports and systems. In such a data-mature environment, the team would have time to advance their data products continuously, opening up new benefit opportunities.
Seventh, fortify the team’s presence. Strategic projects with shorter implementation periods and more immediate impact should be prioritized over longer ones, especially in the beginning. That would help continuously prove the benefits of the data team and the point of its foundation. Putting the data team’s name on its products can help remind decision-makers of the team’s value. In addition, it is highly useful for the data team’s head to have access to top managerial levels to keep promoting the team’s presence and expansion.
Building a data team from scratch requires careful planning, investment, and commitment from organizational leadership. By following these guidelines and adapting them to their specific needs, organizations without prior data capabilities can establish a robust data team capable of driving innovation and offering a competitive advantage through data-driven insights.
Learn more about data management by exploring our articles on data analytics.
**********
Editor’s Note: This post was originally published on April 23, 2024 and last updated on September 17, 2024.
You’ve probably heard tech buzzwords like data-driven decision-making, advanced analytics, artificial intelligence (AI), and so on. What these terms have in common is that they all require data. There is a famous saying in computer science — “garbage in, garbage out” — that captures how poor data leads to bad results, which lead to flawed insights and disastrous judgments. Now, what good is advanced technology if we can’t put it to use?
The problem is clear: organizations need to have a good data management system in place to ensure they have relevant and reliable data. Data management is defined by Oracle as “the process of collecting, storing, and utilizing data in a safe, efficient, and cost-effective manner.” If the scale of your organization is large, it is very reasonable to employ a holistic platform such as an enterprise resource planning (ERP) system.
On the other hand, if your organization is still in its mid to early stages, it is likely that you cannot afford to employ ERP yet. However, this does not mean that your organization does not need data management. Data management with limited resources is still possible as long as the essential notion of effective data management is implemented.
Here are four fundamental tips for starting data management:
Develop a clear data storage system – Data collection, storage, and retrieval are the fundamental components of a data storage system. You can start small by developing a simple data storage system. Use cloud-based file storage, for example, to begin centralizing your data. Organize the data by naming folders and files in a systematic manner; this will allow you to access your data more easily whenever you need it.
Protect data security and set access control – Data is one of the most valuable assets in any organization. Choose a safe, reliable, and trustworthy location (if physical) or service provider (if cloud-based). Make sure that only the individuals you approve have access to your data. This may be accomplished by adjusting file permissions and separating user access rights.
Schedule a routine data backup procedure – Although this procedure is essential, many businesses still fail to back up their data on a regular basis. By doing regular backups, you can protect your organization against unwanted circumstances such as disasters, outages, and so forth. Make sure that your backup location is independent of your primary data storage location. It could be a different service provider or physical location, as long as the backup storage is also secure (see the sketch after this list).
Understand your data and make it simple – First, identify what data your organization requires to meet its objectives. The specifications can then be derived from those objectives. For example, if you are aiming to develop an employee retention program, you will need data on employee turnover. To keep your data focused and organized, remove any data that is irrelevant to your organization’s objectives, including redundant or duplicate data.
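Below is a minimal sketch of the backup step, assuming the destination lives on separate storage (both paths are hypothetical); in practice, a scheduler such as cron or Windows Task Scheduler would run it routinely.

```python
import shutil
from datetime import datetime
from pathlib import Path

source = Path("data/primary")            # primary data storage (hypothetical)
backup_root = Path("/mnt/backup_drive")  # independent backup location (hypothetical)

stamp = datetime.now().strftime("%Y%m%d_%H%M%S")
destination = backup_root / f"backup_{stamp}"

shutil.copytree(source, destination)  # copy the whole folder tree
print(f"Backup written to {destination}")
```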
Data management has become a necessity in today’s data-driven era. No matter the size and type of your organization, you should start doing it now. Good data management is still achievable even with limited resources, and the tips presented here are a starting point for your data management journey.
Learn more about data management by exploring our articles on data analytics.
**********
Editor’s Note: This post was originally published on December 9, 2021 and last updated on September 17, 2024.