If AI software sounds like something that could help your enterprise, a good way to start is to perform analysis on a single business function and expand from there. Solutions exist that can handle a small analysis project and be useful if commitment to AI expands later.
In some ways, artificial intelligence (AI) still seems like a science-fiction concept, but it's not. AI packages exist today that can help enterprises do a better job of spotting problems, analyzing business conditions to figure out better strategies going forward, and automating predictive processes to provide better outcomes than simply relying on, for example, some CEO's "gut instinct." It may be helpful to those thinking about dipping a toe into this new technology to take a brief look at some of the major AI software packages already available.
The foundation for nearly all AI packages is machine learning (ML), a procedure by which a computer system is fed datasets and is trained to process them. The goal of ML is to teach computer systems how to learn and improve on their own, without a human having to explicitly program all the learning steps. Instead, a computer learns by experience.
The Major Types of Machine Learning
There are three major types of ML: supervised, unsupervised, and semi-supervised. Most ML systems can use all three, as specified by a human user.
Supervised ML starts with a known dataset that the system is trained to analyze via a model. The system uses an inferred function to predict output values of the particular dataset it's analyzing at the time. Those projected values are compared to the actual expected values, after which the system modifies the model it will use to analyze the next dataset. After numerous iterations, the system learns to modify its learning model to become more and more accurate until the model is trusted enough to analyze fresh data.
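The predict, compare, and adjust cycle described above can be sketched in a few lines of Python. This is only a toy illustration, not how any product listed later works: the dataset, the straight-line model, and the `train_linear_model` function with its learning-rate and iteration settings are all assumptions made for the example.

```python
# Toy supervised learning: fit y = w * x + b to examples whose answers are known.
# The data, model form, and hyperparameters are illustrative assumptions.

def train_linear_model(examples, learning_rate=0.01, iterations=5000):
    """Adjust w and b by repeatedly comparing predictions with the known values."""
    w, b = 0.0, 0.0
    n = len(examples)
    for _ in range(iterations):
        grad_w = 0.0
        grad_b = 0.0
        for x, y_true in examples:
            y_pred = w * x + b          # predict with the current model
            error = y_pred - y_true     # compare with the expected value
            grad_w += 2 * error * x / n
            grad_b += 2 * error / n
        # Modify the model so the next pass is slightly more accurate.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Known dataset: every input x comes with its expected output (here y = 2x + 1).
labeled_data = [(1.0, 3.0), (2.0, 5.0), (3.0, 7.0), (4.0, 9.0)]

w, b = train_linear_model(labeled_data)

# Once trained, the model can be applied to fresh data it has not seen.
prediction = w * 5.0 + b
print(f"w={w:.3f}, b={b:.3f}, prediction for x=5: {prediction:.3f}")
```

After enough iterations the learned values settle close to the slope and intercept that generated the data, which is the point at which a model like this might be trusted with new inputs.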
In unsupervised ML, the system is given a large amount of "unlabeled" data with no particular outcome already known. Labeled data includes some kind of embedded descriptive information (for example, data specified as "medical records") while unlabeled data might, for instance, simply be a series of x-ray images. By iterative methods similar to supervised learning, the system learns to draw inferences that let it predict hidden structures in uncategorized data.
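As a rough illustration of finding hidden structure in unlabeled data, the sketch below groups a handful of numbers into clusters using a simplified one-dimensional k-means procedure. The technique is swapped in purely as an example of unsupervised learning; the `k_means_1d` function, the sample readings, and the choice of two clusters are assumptions rather than anything drawn from a specific product.

```python
# Toy unsupervised learning: group unlabeled numbers into k clusters (1-D k-means).
# Function name, data, and k are illustrative assumptions.

def k_means_1d(values, k=2, iterations=20):
    """Return (centroids, clusters) found without using any labels."""
    ordered = sorted(values)
    # Deterministic starting centroids spread evenly across the sorted data.
    step = (len(ordered) - 1) / max(k - 1, 1)
    centroids = [ordered[round(i * step)] for i in range(k)]

    for _ in range(iterations):
        # Assign every value to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Move each centroid to the mean of its cluster (keep it if the cluster is empty).
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters


# No labels are supplied; the two groups are discovered from the values alone.
readings = [1.0, 1.2, 0.8, 9.0, 9.5, 10.1]
centroids, clusters = k_means_1d(readings, k=2)
print(centroids)
print(clusters)
```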
Semi-supervised ML is a blend of the two other methods. Most often, the inputs are a limited amount of labeled data and a large amount of unlabeled data. In this way, the method tries to mimic human learning, in which children are given some information by parents and teachers and then go on to draw inferences from the large amount of unstructured data they encounter in the course of normal experience.
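One common way to combine a little labeled data with a lot of unlabeled data is "self-training," in which the system labels the unlabeled items it is most sure about and then learns from its own labels. The sketch below is a minimal version of that idea under stated assumptions: the `self_train` and `centroids_of` functions, the "low" and "high" labels, and the sample values are all hypothetical.

```python
# Toy semi-supervised learning via self-training on 1-D data.
# Names, labels, and values are illustrative assumptions.

def centroids_of(labeled):
    """Average value per label, computed from (value, label) pairs."""
    groups = {}
    for value, label in labeled:
        groups.setdefault(label, []).append(value)
    return {label: sum(vs) / len(vs) for label, vs in groups.items()}


def self_train(labeled, unlabeled):
    """Label the unlabeled values one at a time, most confident first."""
    labeled = list(labeled)
    remaining = list(unlabeled)
    while remaining:
        centroids = centroids_of(labeled)
        scored = []
        for value in remaining:
            # Tentative label: the label whose centroid is nearest to this value.
            label = min(centroids, key=lambda l: abs(value - centroids[l]))
            scored.append((abs(value - centroids[label]), value, label))
        # The smallest distance is treated as the most confident guess.
        _, value, label = min(scored)
        labeled.append((value, label))
        remaining.remove(value)
    return labeled


# Two labeled examples (the "parents and teachers") and six unlabeled ones.
seed = [(1.0, "low"), (10.0, "high")]
unlabeled = [2.0, 3.0, 4.0, 7.0, 8.0, 9.0]

result = dict(self_train(seed, unlabeled))
print(result)
```

Because each newly labeled value shifts the centroids, later guesses benefit from earlier ones, which is the sense in which the unlabeled data contributes to learning.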
Commercial AI Software Products
What follows is a brief overview of some of the major commercially available packages for data analysis and other functions that can be used standalone or incorporated into more-specific AI apps. A few term definitions will ease your way through the descriptions.
"Structured" data consists of values such as numbers and dates held in defined fields, while "unstructured" data refers to free-form sources such as text. "Data transformation" refers to converting data from one format to a different supported format. "Predictive analytics" is the process of using historical data to forecast possible future outcomes. "Streaming data" is data that arrives continuously and concurrently from a large variety of sources. "Data engineering" is the practical process of collecting and validating datasets. "Data cleansing" is the process of preparing data for analysis by removing data from a dataset that may be incorrect, incomplete, corrupted, or otherwise possibly invalid.
A "data lake" refers to a single, large repository of data at any scale, and without a requirement of having to move all the data to a structured format before analysis can take place. In effect, data lakes can draw information from social media, Internet-connected devices, log files, and click streams in addition to structured databases. This makes data lakes potentially more flexible than data warehouses, which primarily pull information from business applications and transaction systems.
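Of the terms defined above, data cleansing is perhaps the easiest to picture in code. The sketch below filters a small list of order records, dropping any that are incomplete or invalid. The field names (`customer_id`, `order_date`, `amount`), the validity rules, and the sample records are all assumptions invented for illustration; real cleansing rules depend entirely on the dataset.

```python
# Toy data cleansing: keep only records that pass simple validity checks.
# Field names, rules, and sample data are illustrative assumptions.
from datetime import datetime


def is_valid(record):
    """Keep a record only if it has an ID, a parseable date, and a non-negative amount."""
    if not record.get("customer_id"):
        return False  # incomplete: missing identifier
    try:
        datetime.strptime(record.get("order_date", ""), "%Y-%m-%d")
        amount = float(record.get("amount"))
    except (TypeError, ValueError):
        return False  # corrupted or missing date/amount
    return amount >= 0  # incorrect: negative amounts are rejected


raw_records = [
    {"customer_id": "C001", "order_date": "2021-03-14", "amount": "120.50"},
    {"customer_id": "", "order_date": "2021-03-15", "amount": "80.00"},        # no ID
    {"customer_id": "C003", "order_date": "14/03/2021", "amount": "42.00"},    # bad date format
    {"customer_id": "C004", "order_date": "2021-03-16", "amount": "-5"},       # negative amount
    {"customer_id": "C005", "order_date": "2021-03-17", "amount": None},       # missing amount
    {"customer_id": "C006", "order_date": "2021-03-18", "amount": "15"},
]

clean_records = [r for r in raw_records if is_valid(r)]
kept_ids = [r["customer_id"] for r in clean_records]
print(kept_ids)
```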
None of the descriptions should be interpreted as a complete view of each product's capabilities, nor any sort of relative ranking of each product's value.
Alteryx lets users blend together multiple sources to form a data-analysis platform. The software is PC-based and runs on Windows 7 or higher to validate "the health, quality and statistical distribution" of data and is supported by an online community site. Experienced users can employ the data to build advanced analytics data models by using more than 50 prebuilt tools that function without coding, or by using self-written scripts in R and Python. Analysis can be augmented by geospatial intelligence to build and envision location-based calculations. There's an extensive tool set to build reports in a wide array of document formats (e.g., PDF, HTML, DOCX, XLSX) and deliver data directly to visualization formats like Microsoft Power BI, Tableau, or Qlik. Alteryx also enables natural-language inputs in English, French, German, Japanese, Portuguese, and Spanish.
Google Cloud AI Platform is a means of carrying out ML. Once this learning program is implemented, it's referred to as a "trained model." Trained models help users identify the validity of the data used to build them. The Cloud AI Platform helps users train, evaluate, tune, deploy, manage, monitor, and extract predictions from resulting data models. Via a "Cloud Console" feature, the Cloud AI Platform provides a UI for controlling ML functions and operations, making predictions from data, and issuing other commands to the system. The product has APIs for interfacing with programs written in Python.
H2O is an open-source ML platform that also provides other AI-related services, such as search and visualization tools for ad hoc data analysis, automatic modeling, report and dashboard generation, and the ability for users to construct their own AI apps that use augmented datasets. H2O is geared primarily for the financial, insurance, healthcare, marketing, manufacturing, and telecom industries. It includes modules that are currently being used for COVID-19 research.
IBM Decision Optimization Center is a central location to learn about IBM's family of products that help users build mathematical optimization models of business situations in order to make better decisions more quickly. Optimization models display the most important characteristics of problems users might be trying to solve by looking at the objective function of a business decision, variables that might affect that decision, and business constraints. The IBM ILOG CPLEX Optimization Studio provides a built-in Optimization Programming Language (OPL) (or alternatively works with other programs built in C, C++, C# APIs, Java, or Python) to build and deploy models that help identify the best actions in given business situations. IBM Decision Optimization for Watson Studio includes features such as a modeling assistant and visual dashboards to facilitate model building and what-if analysis tools to sort out results of multiple scenarios. IBM Watson Studio Premium for IBM Cloud Pak for Data is a combination of IBM products that help enterprises predict business outcomes, partly by submitting constructed models to the Apache Hadoop Engine for further analysis. The IBM Decision Optimization in IBM Watson Machine Learning helps users build and deploy optimization models in cloud environments.
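To make the three ingredients of an optimization model (objective function, decision variables, and business constraints) more concrete, here is a deliberately tiny product-mix example in plain Python. It does not use OPL, CPLEX, or any IBM API; it simply tries every combination by brute force, which is workable only for toy problems. The products, profit figures, and resource limits are invented for the illustration.

```python
# Toy optimization model solved by brute force (not OPL or any CPLEX API).
# All products, coefficients, and limits are illustrative assumptions.
from itertools import product

PROFIT = {"standard": 30, "deluxe": 50}     # objective: profit per unit
LABOR = {"standard": 2, "deluxe": 4}        # labor hours needed per unit
MATERIAL = {"standard": 3, "deluxe": 2}     # material units needed per unit
LABOR_LIMIT = 40                            # business constraint
MATERIAL_LIMIT = 36                         # business constraint


def best_production_plan(max_units=20):
    """Try every whole-number plan and return the most profitable feasible one."""
    best_plan, best_profit = None, -1
    # Decision variables: how many units of each product to make.
    for standard, deluxe in product(range(max_units + 1), repeat=2):
        labor = LABOR["standard"] * standard + LABOR["deluxe"] * deluxe
        material = MATERIAL["standard"] * standard + MATERIAL["deluxe"] * deluxe
        if labor > LABOR_LIMIT or material > MATERIAL_LIMIT:
            continue  # violates a constraint, so the plan is infeasible
        profit = PROFIT["standard"] * standard + PROFIT["deluxe"] * deluxe
        if profit > best_profit:
            best_plan, best_profit = (standard, deluxe), profit
    return best_plan, best_profit


plan, profit = best_production_plan()
print(f"standard={plan[0]}, deluxe={plan[1]}, profit={profit}")
```

A dedicated solver reaches the same kind of answer without enumerating every possibility, which is what makes models with thousands of variables practical.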
IBM SPSS Modeler runs on PCs in either a client/server configuration or a standalone desktop. It mines data and analyzes text sources for information, as well as provides predictive analytics capabilities. The front end runs on PCs and the back end runs on servers using UNIX variants, Linux, or Windows. The modeler analyzes structured and unstructured data from sources such as files, survey data, operational databases, the IBM Cognos 8 Business Intelligence framework, and flat files such as IBM SPSS Statistics, SAS, and MS Excel files. Users can access predictive, data-transformative, testing, and reporting characteristics from the same interface. Professional edition includes tools for analyzing existing data. Premium edition adds a text-mining feature for retrieving concepts, relationships, and sentiments from text data, as well as converting unstructured data to a structured format.
IBM Watson Studio is available via the cloud and automates many data-preparation tasks, enables preparation of predictive models with a mix of visual tools, draws from most common data sources (e.g., spreadsheets, flat files, relational databases), and enables data display by facilitating data export into presentations that can use dozens of prebuilt chart types. Watson Studio also integrates with the IBM SPSS Modeler to access that product's features.
Infosys Nia is an open-source platform that absorbs information about business processes and represents them in a summarizing structure. Available via public, private, or hybrid clouds, Nia works with browsers such as IE V7-11, Firefox, Chrome, Safari V9 or better, and Microsoft Edge. Nia has a discovery subsystem that can summarize what information is available from enterprise databases, as well as a learning subsystem that can take inputs about new apps from potential users via natural language documents (such as memos), as well as machine-learning that can assimilate information from other data sources. Other subsystems provide runtime capabilities, self-healing responses to problems, and an automation platform for predictive automation (building models of automation processes), cognitive automation (building processes that mimic human behavior), and robotic process automation (programming robotic machinery functions).
MathWorks MATLAB "combines a desktop environment tuned for iterative analysis and design processes with a programming language that expresses matrix and array mathematics directly," according to MathWorks. The product lets developers, for example, run different sample algorithms against available data to see what results they produce. The Live Editor function generates scripts that can blend code, output, and formatted text. A Data Analysis function can compile, cleanse, and analyze multiple datasets; includes prebuilt widgets for signal processing, machine learning, and statistical analysis; and generates sharable analysis reports. There's also a drag-and-drop App Builder for generating GUIs and specifying app behavior, a Plot Gallery with dozens of standard and customizable means of graphing results, and APIs that let MATLAB be called by other apps written in C, C++, Fortran, Java, Python, and apps using some COM components, such as Visual C# .NET and Visual Basic .NET.
Qubole works with data lakes. It's cloud-based and provides end-to-end services that help users with ML, ad hoc data analytics, and streaming data. Data-management tools help users manage metadata and infrastructure, reveal statistics and data dependencies, and automate control of clustered resources, all across cloud environments. The platform also offers automated continuous data engineering.
RapidMiner, from the company of the same name, is a platform that aims at helping both data scientists and less-technical corporate end users. It's an automated data-science platform that provides functions such as data analysis, data cleansing and transformation, model deployment and optimization, prebuilt use templates, and tutorials for the inexperienced user. RapidMiner integrates with custom code written in Python or R, supports any third-party ML libraries, and includes more than 1500 scripts that provide individual data-science and data-preparation functions.
Symphony AyasdiAI is an application framework designed primarily for financial applications and uses an engine called Topological Data Analysis (TDA), which is based on the mathematical concept of topology. Topology is the idea that data has an underlying shape that gives the data meaning. TDA adapts this method to analyze highly complex data. Ayasdi builds compressed diagrams of data points to display important patterns as a way of showing users geometric relationships that may exist between data points. It combines with other machine-learning algorithms to find patterns in data to generate insights for users. Ayasdi is currently used in situations such as combatting money laundering and bank fraud, promoting healthcare institution cost reductions and reducing health insurance claim refusals, and improving general performance of other software applications.
TensorFlow is an open-source library of ML and neural networking algorithms that functions as an end-to-end platform for developing and training ML models. It provides tools for managing ML environments and APIs for defining and training ML models, as well as using the data to make predictions. TensorFlow offers a Python front-end API for building analytical apps and executes the underlying operations in C++. Its apps can run on local PCs, iOS and Android devices, a public cloud, and other CPUs. By handling algorithm deployment and implementation, as well as connecting outputs to the next function looking for an input, TensorFlow frees developers to concentrate on an application's overall design instead.
Wipro HOLMES is a proprietary AI and ML platform that helps accelerate existing business processes by automating them. Its name is a reference to IBM's Watson, but it's an independent platform that can run above existing business applications to coordinate business functions such as finance, human resources, legal, marketing, operations, procurement, and regulatory compliance. It contains separate modules for use by COOs, CFOs, CLOs, and procurement officers.
There's No One-Size-Fits-All AI Product
Like any kind of software, none of these alternatives is likely to be perfect for your situation. All of them offer something but many are specialized for certain kinds of businesses. Adding to the potential confusion is that your enterprise may not know today what kind of AI would best suit its particular profile and ways of doing business. As is true in ML, an enterprise's learning curve about AI will only get to a good outcome via experience. That experience will start with a first step in some direction, and any of the products above will help you begin that learning process.
MC Press Online