In May 2023, Samsung Electronics prohibited its employees from using generative artificial intelligence (AI) tools like ChatGPT. The ban was issued in an official memo, after discovering that staff had uploaded sensitive code to the platform, which prompted security and privacy concerns for stakeholders, fearing sensitive data leakage. Apple and several Wall Street Banks have also enforced similar bans.
While generative AI contributes to increased efficiency and productivity in businesses, what makes it susceptible to security risks is also its core function: taking the user’s input (prompt) to generate content (response), such as text, codes, images, videos, and audio in different formats. The multiple sources of data, the involvement of third-party systems, and human factors influencing the adoption of generative AI add to the complexity. Failing to properly prepare for and manage security and privacy issues that come with using generative AI may expose businesses to potential legal repercussions.
Safety depends on where data is stored
So, the question becomes, how can businesses use generative AI safely? The answer resides in where the user’s data (prompts and responses) gets stored. The data storage location in turn depends on how the business is using generative AI, of which there are two main methods.
Off-shelf tools: The first method is to use ready-made tools, like OpenAI’s ChatGPT, Microsoft’s Bing Copilot, and Google’s Bard. These are, in fact, nothing but applications with user interfaces that allow them to interact with the base technology that is underneath, namely large language models (LLMs). LLMs are pieces of code that tell machines how to respond to the prompt, enabled by their training on huge amounts of data.
In the case of off-the-shelf tools, data resides in the service provider’s servers—OpenAI’s in the instance of ChatGPT. As a part of the provider’s databases, users have no control over the data they provide to the tool, which can cause great dangers, like sensitive data leakage.
How the service provider treats user data depends on each platform’s end-user license agreement (EULA). Different platforms have different EULAs, and the same platform typically has different ones for its free and premium services. Even the same service may change its terms and conditions as the tool develops. Many platforms have already changed their legal bindings over their short existence.
In-house tools: The second way is to build a private in-house tool, usually by directly deploying one of the LLMs on private servers or less commonly by building an LLM from scratch.
Within this structure, data resides in the organization’s private servers, whether they are on-premises or on the cloud. This means that the business can have far more control over the data processed by its generative AI tool.
Ensuring the security of off-the-shelf tools
Ready-made tools exempt users from the high cost of technology and talent needed to develop their own or outsource the task to a third party. That is why many organizations have no alternative but to use what is on the market, like ChatGPT. The risks of using off-the-shelf generative AI tools can be mitigated by doing the following:
Review the EULAs. In this case, it is crucial to not engage with these tools haphazardly. First, organizations should survey the available options and consider the EULAs of the ones of interest, in addition to their cost and use cases. This includes keeping an eye on the EULAs even after adoption as they are subject to change.
Establish internal policies. When a tool is picked for adoption, businesses need to formulate their own policies on how and when their employees may use it. This includes what sort of tasks can be entrusted to AI and what information or data can be fed into the service provider’s algorithms.
As a rule of thumb, it is advisable not to throw sensitive data and information into others’ servers. Still, it is up to each organization to settle on what constitutes “sensitive data” and what level of risk it is willing to tolerate that can be weighed out by the benefits of the tool adoption.
Ensuring the security of in-house tools
The big corporations that banned the use of third-party services ended up developing their internal generative AI tools instead and incorporated them into their operations. In addition to the significant security advantages, developing in-house tools allows for their fine-tuning and orienting to be domain and task-specific, not to mention gaining full control over their interface user experience.
Check the technical specifications. Developing in-house tools, however, does not absolve organizations from security obligations. Typically, internal tools are built on top of an LLM that is developed by a tech corporation, like Meta AI’s LLaMa, Google’s BERT, or Hugging Face’s BLOOM. Such major models, especially open-source ones, are developed with high-level security and privacy measures, but each has its limitations and strengths.
Therefore, it would still be crucial to first review the adopted model’s technical guide and understand how it works, which would not only lead to better security but also a more accurate estimation of technical requirements.
Initiate a trial period. Even in the case of building the LLM from scratch, and in all cases of AI tool development, it is imperative to test the tool and enhance it both during and after development to ensure safe operation before being rolled out. This includes fortifying the tool against prompt injections, which can be used to manipulate the tool to perform damaging cyber-attacks that include leaking sensitive data even if they reside in internal servers.
Parting words: be wary of hype
While on the surface, the hype surrounding generative AI offers vast possibilities, lurking in the depths of its promise are significant security risks that must not be overlooked. In the case of using ready-made tools, rigorous policies should be formulated to ensure safe usage. And in the case of in-house tool deployment, safety measures must be incorporated into the process to prevent manipulation and misuse. In both cases, the promises of technology must not blind companies to the very real threat to their sensitive and private information.
Jino Noel is a data science and technology leader with extensive experience in building data teams and practices across different organizations. His experience ranges from working in startups to large conglomerates across both Australia and the Philippines. At the time of this interview, he was the Chief Data Officer at Data Analytics Ventures, Inc. (DAVI). Currently, he is the Chief Data Officer at Angkas.
What are the key skills that a Chief Data Officer should possess nowadays?
A Chief Data Officer should have both data-related technical expertise as well as people leadership skills. Leading will always be part of the job, particularly for highly specialized technical people such as data engineers and data scientists. To be able to lead them properly, I believe it is better to be a technical person myself, so I can discuss technical matters fluently, which helps me gain their trust.
What data-related challenges have you faced as the Chief Data Officer of DAVI? How did you overcome these challenges?
Our data-related challenges are the same as any company. Being able to trust our data, cleaning up data from our sources, data latencies, and other related issues. DAVI overcame these by investing in people—hiring high-quality experts in our data engineering, data governance, and analytics teams to help us make sense of the data coming in—and building robust data pipelines that have increased the standard of quality of the data in our data lake.
How does DAVI make use of advancements in artificial intelligence (AI) and machine learning to help its clients understand their customers’ needs and buying patterns?
DAVI has recently started using machine learning to model our users’ propensity to buy certain products. This helps us create more accurate target audiences for our precision marketing campaigns. We are also moving forward with a recommendation engine project, with the goal of improving user engagement with our retail partners and with our promos and campaigns. On top of this, we are improving our machine learning operations expertise to make our model deployments repeatable and robust.
In the digital marketplace, data analytics acts as a guiding compass for app developers, enabling the creation of personalized, high-performing applications that align with user preferences. By leveraging data, developers can understand nuanced user behaviors and preferences, allowing them to tailor apps to meet specific user needs and aspirations.
Dive deeper into these discussions by reading Jino Noel’s full interview with The KPI Institute. Download the free digital copy of PERFORMANCE Magazine Issue No. 26, 2023 – Data Analytics on the TKI Marketplace. You can also purchase a physical copy via Amazon.
Alfonso Medela is the Chief Artificial Intelligence (AI) Officer at Legit.Health, where he oversees the use of advanced computer vision algorithms. A renowned expert in few-shot learning and medical imaging, his contributions include developing an algorithm capable of diagnosing over 232 skin conditions.
What are the key skills that a Chief AI Officer should possess in the context of your role at Legit.Health?
A Chief AI Officer at a medical organization like Legit.Health needs strong AI expertise, including extensive knowledge of machine and deep learning, and a profound understanding of medical data and healthcare to ensure precise algorithm development. Besides technical skills, strategic thinking and leadership are vital for guiding the AI team and aligning with company goals. Great communication and collaboration skills are also crucial for working effectively with different teams.
Can you describe your experience in developing and implementing AI strategies for computer vision applications, specifically in the context of diagnosing and treating skin pathologies? How have you leveraged AI to improve diagnosis accuracy and enable life-saving therapies?
Heading a team of specialists, we’ve developed advanced algorithms that accurately identify over 232 skin conditions and automate follow-ups for chronic skin conditions. Using deep learning techniques, our platform provides real-time diagnostic support to healthcare professionals, improving their accuracy and enabling early intervention. By collaborating with medical experts and continuously refining our algorithms, we are able to offer a powerful tool that empowers clinicians, transforming healthcare and improving patient outcomes.
What approaches or methodologies do you use to ensure the accuracy and reliability of computer vision algorithms in the context of skin pathology diagnosis? Can you share examples of how you have validated the performance of AI models and ensured their safety and effectiveness in real-world clinical settings?
To guarantee accuracy and reliability, our computer vision algorithms undergo a multi-stage validation process that encompasses retrospective and prospective clinical validations. Rigorous testing is performed on diverse, representative datasets, employing cross-validation to assess model performance. We collaborate closely with medical professionals, reviewing AI model outputs and gathering feedback to iteratively refine our algorithms. Furthermore, we conduct clinical trials and pilot studies to evaluate safety and efficacy. This ensures that our models adhere to real-world requirements and actively contribute to enhancing patient outcomes.
AI stands as one of the most transformative technologies of the modern era, revolutionizing the way people approach complex problems across various fields. From enhancing healthcare diagnostics to driving advancements in autonomous vehicles, AI’s potential is vast and continually expanding.
To explore the full spectrum of Alfonso Medela’s pioneering work in AI and to stay updated with the latest industry insights, read his full interview exclusively featured in the PERFORMANCE Magazine Issue No 26, 2023 – Data Analytics edition. Download your free copy now through the TKI Marketplace or purchase a printed copy from Amazon.
Harry Patria, the CEO of Patria & Co., is a data strategist and lecturer who founded a company that serves over 100 corporate clients, 200 analytical platforms, and 500 professionals. He is a Data Hackathon winner in the UK and graduated with distinction from his master’s degree to a PhD program with a fully-funded scholarship. Harry is a subject matter expert in several fields.
“The world’s most valuable resource is no longer oil, but data.”
That statement from The Economist in 2017 cannot be overstated. Businesses in all shapes and sizes must realize that adapting to an already data-driven world is the only way to survive, connect, and thrive.
Artificial Intelligence (AI) was introduced in the 1950s by a computer researcher named John McCarthy. He defined AI as “the science and engineering of making intelligent machines.”
Nowadays, innovation pioneers like Microsoft, Google, and IBM have made strides in AI advancement to back cloud analytics, client engagement, and more. AI has become a program outlined to complete tasks that would regularly require human capabilities or input. AI is considered an innovation that takes after or mirrors human insights and actions, including speech, reviewing pictures, or making a conversation. To a great extent, AI can do those things by recognizing designs inside the information and reacting based on pre-defined rationale.
On the other hand, big data is an extensive, fast, and diverse information resource that requires advanced forms of processing to improve decision making, knowledge generation, and process optimization.
Big data describes sets of information created in different formats and through different sources, such as software applications, IoT sensors, customer feedback surveys, videos, and images..
Big datasets are developed by collecting large amounts of information from real-time data streams, established databases, or legacy datasets. As the environment constantly changes and grows, we need powerful software to protect, classify, and explain information for both short-term and long-term use.
Organizations often use a combination of cloud-based applications and data warehousing tools to develop analytic architectures that collect, organize, and visualize data. AI-powered tools are central to tailoring many of these moving parts to consistent insights that support decision-making.
Linking Up Big Data and AI for Business
Implementing big data with AI has already been vital for many businesses that aim to have a competitive edge. It doesn’t really matter whether it is a new company or an established leader in the market. They use data-driven strategies to turn information into perceptible value. It is common to find big data in almost every industry, from IT and banking to agriculture and healthcare.
Business experts acknowledge that big data and AI can create new ideas for growth and expansion. There is even a possibility that a new type of business will become popular soon: data analysis and aggregation companies for particular industries. The purpose of those organizations is to process enormous flows of data and generate insights. Before this happens, businesses should empower their big data capabilities intensively. In the past, estimations were made based on the retroactive point of view. Leveraging real-time analysis, big data can empower predictions and allow strategists to test assumptions and theories faster.
Data and AI are typically applied to analytics and automation, helping businesses transform their operations in the process.
Analytics tools like Microsoft, Azure, and Synapse help organizations predict or identify trends that inform decision-making around product development, service delivery, workflows, and more. Additionally, your data will be organized into dashboard visualizations, reports, charts, and graphs for readability.
Big data and AI in Health
The global market for AI-driven health care is expected to register a CAGR of 40 percent through 2021 and toup from USD 600 million in 2014. Further advances in AI and big data provide developing countries with opportunities to solve existing challenges in the health care access of their populations. AI combined with robotics and IoMT could also help developing countries address healthcare problems and meet SDG 3 on good health and well-being. AI can be deployed in health training, keeping well, early disease detection, diagnosis, decision-making, treatment, end-of-life care, and health research. For instance, AI can outperform radiologists in cancer screening, particularly in patients with lung cancer. Results suggest that the use of AI can cut false positives by 11 percent.
Big data and AI in Agriculture
Today’s global population of 7.6 billion is expected to rise to 9.8 billion by 2050, with half of the world’s population growth concentrated in nine countries, such as India, Nigeria, the Democratic Republic of the Congo, Pakistan, Ethiopia, the United Republic of Tanzania, the United States of America, Uganda, and Indonesia.
The growing demand for food will put massive pressure on the use of water and soil. All of this will be exacerbated by climate change and global warming.
Big data and AI in Education
AI can reshape high-quality education and learning through precisely targeted and individually customized human capital investments. Incorporating AI into online courses enhances access to affordable education and improves learning and employment in emerging markets. Also, AI technologies can ensure equitable and inclusive access to education, providing marginalized people and communities, such as persons with disabilities, refugees, and those out of school or living in isolated communities, with access to appropriate learning opportunities.
Expected Economic Gains from AI Worldwide
AI could contribute up to USD 15.7 trillion to the global economy in 2030, more than the current GDP of China and India combined. Of this, USD 6.6 trillion will be derived from increased productivity and USD 9.1 trillion from the knock-on effects of consumption. The total projected impact for Africa and Asia-Pacific markets would be USD 1.2 trillion. For comparison, the combined 2019 GDP for all countries in sub-Saharan Africa was USD 1.8 trillion. Thus, the successful deployment of AI and big data would open up a world of opportunities for developing countries.
The big data market is expected to grow tremendously over the projected years. One of the important reasons is the rapid increase in the amount of structured and unstructured data. Factors include the increasing penetration of technology and the proliferation of smartphones in all areas of life. This leads to a large amount of data.
Other industries such as healthcare, utilities, and banking make extensive use of online platforms to provide enhanced services to their customers.
Intelligent use of big data in day-to-day operations enables you to make data-driven decisions and respond quickly to market trends that have a direct impact on business performance.
If you would like to learn more about the best practices for analyzing data, sign up for The KPI Institute’s Data Analysis Certification.