<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Models Archives - Artificial Intelligence</title>
	<atom:link href="https://www.aiuniverse.xyz/tag/models/feed/" rel="self" type="application/rss+xml" />
	<link>https://www.aiuniverse.xyz/tag/models/</link>
	<description>Exploring the universe of Intelligence</description>
	<lastBuildDate>Wed, 22 Jan 2025 05:44:52 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.9.1</generator>
	<item>
		<title>What is Keras and Use Cases of Keras?</title>
		<link>https://www.aiuniverse.xyz/what-is-keras-and-use-cases-of-keras/</link>
					<comments>https://www.aiuniverse.xyz/what-is-keras-and-use-cases-of-keras/#respond</comments>
		
		<dc:creator><![CDATA[vijay]]></dc:creator>
		<pubDate>Wed, 22 Jan 2025 05:44:48 +0000</pubDate>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[features]]></category>
		<category><![CDATA[framework]]></category>
		<category><![CDATA[keras]]></category>
		<category><![CDATA[Models]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[Top]]></category>
		<category><![CDATA[training]]></category>
		<category><![CDATA[Use]]></category>
		<guid isPermaLink="false">https://www.aiuniverse.xyz/?p=20617</guid>

					<description><![CDATA[<p>Keras is a high-level deep learning framework that provides an easy-to-use interface for building, training, and deploying deep learning models. It is written in Python and can run on top of popular deep learning backends like TensorFlow, Theano, and Microsoft Cognitive Toolkit (CNTK). Keras is designed with user-friendliness, modularity, and extensibility in mind, making it <a class="read-more-link" href="https://www.aiuniverse.xyz/what-is-keras-and-use-cases-of-keras/">Read More</a></p>
<p>The post <a href="https://www.aiuniverse.xyz/what-is-keras-and-use-cases-of-keras/">What is Keras and Use Cases of Keras?</a> appeared first on <a href="https://www.aiuniverse.xyz">Artificial Intelligence</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<figure class="wp-block-image size-full"><img fetchpriority="high" decoding="async" width="747" height="398" src="https://www.aiuniverse.xyz/wp-content/uploads/2025/01/image-151.png" alt="" class="wp-image-20618" srcset="https://www.aiuniverse.xyz/wp-content/uploads/2025/01/image-151.png 747w, https://www.aiuniverse.xyz/wp-content/uploads/2025/01/image-151-300x160.png 300w" sizes="(max-width: 747px) 100vw, 747px" /></figure>



<p>Keras is a high-level deep learning framework that provides an easy-to-use interface for building, training, and deploying deep learning models. It is written in Python and can run on top of popular deep learning backends like TensorFlow, Theano, and Microsoft Cognitive Toolkit (CNTK). Keras is designed with user-friendliness, modularity, and extensibility in mind, making it a go-to tool for researchers, engineers, and developers exploring artificial intelligence and machine learning.</p>



<hr class="wp-block-separator has-alpha-channel-opacity" />



<h3 class="wp-block-heading">What is Keras?</h3>



<p>Keras simplifies building deep learning models by abstracting the complexities of backend computations. It allows users to focus more on designing and iterating models rather than dealing with the low-level details of tensor operations. Some of its key characteristics include:</p>



<ul class="wp-block-list">
<li><strong>User-Friendly:</strong> Offers a clean and intuitive API for fast prototyping.</li>



<li><strong>Extensibility:</strong> Supports custom layers, metrics, and loss functions.</li>



<li><strong>Cross-Platform Compatibility:</strong> Runs seamlessly on CPUs, GPUs, and TPUs.</li>



<li><strong>Wide Adoption:</strong> Used in academic research, industrial applications, and startups worldwide.</li>
</ul>



<hr class="wp-block-separator has-alpha-channel-opacity" />



<h3 class="wp-block-heading">Top 10 Use Cases of Keras</h3>



<ol start="1" class="wp-block-list">
<li><strong>Image Classification:</strong> Keras is widely used for training models that classify images into categories, such as identifying objects in pictures or detecting facial expressions.</li>



<li><strong>Natural Language Processing (NLP):</strong> Used for tasks like text classification, sentiment analysis, and language translation.</li>



<li><strong>Speech Recognition:</strong> Enables the development of speech-to-text systems and voice assistants.</li>



<li><strong>Recommendation Systems:</strong> Powers personalized recommendations, such as those used by e-commerce and streaming platforms.</li>



<li><strong>Healthcare Applications:</strong> Assists in medical imaging diagnostics, such as detecting anomalies in X-rays or MRIs.</li>



<li><strong>Anomaly Detection:</strong> Detects fraudulent activities, such as credit card fraud, and irregular patterns in financial transactions.</li>



<li><strong>Time Series Analysis:</strong> Used for forecasting trends, stock prices, and weather patterns.</li>



<li><strong>Autonomous Driving:</strong> Facilitates perception and decision-making systems for self-driving cars.</li>



<li><strong>Generative Models:</strong> Enables the creation of realistic images, videos, and music using GANs (Generative Adversarial Networks).</li>



<li><strong>Robotics:</strong> Helps robots learn motor skills, object recognition, and navigation.</li>
</ol>
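<p>As a concrete illustration of the first use case, a minimal image classifier can be sketched as follows. The input shape, layer sizes, and class count are illustrative assumptions, not taken from any specific dataset:</p>

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# A minimal convolutional classifier of the kind used for image
# classification, e.g. digit recognition. Shapes are illustrative:
# 28x28 grayscale inputs, 10 output classes.
model = keras.Sequential([
    keras.Input(shape=(28, 28, 1)),
    layers.Conv2D(16, 3, activation="relu"),  # learn local image features
    layers.MaxPooling2D(),                    # downsample feature maps
    layers.Flatten(),
    layers.Dense(10, activation="softmax"),   # class probabilities
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```

The same skeleton extends to the other use cases by swapping in recurrent, embedding, or convolutional layers appropriate to the data.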



<hr class="wp-block-separator has-alpha-channel-opacity" />



<h3 class="wp-block-heading">What are the Features of Keras?</h3>



<ol start="1" class="wp-block-list">
<li><strong>Modularity:</strong> Components like layers, loss functions, and optimizers are fully modular and easily configurable.</li>



<li><strong>Pretrained Models:</strong> Provides access to a library of pretrained models such as VGG, ResNet, and Inception for transfer learning.</li>



<li><strong>Multiple Backend Support:</strong> Works with TensorFlow, Theano, or CNTK for flexibility in deployment.</li>



<li><strong>Extensive Documentation:</strong> Offers detailed guides and examples for beginners and advanced users.</li>



<li><strong>Customizability:</strong> Supports building custom neural network architectures and layers.</li>



<li><strong>Built-in Support for GPUs:</strong> Accelerates model training by utilizing GPU hardware.</li>



<li><strong>Integration with Other Libraries:</strong> Compatible with NumPy, Pandas, and Matplotlib for preprocessing and visualization.</li>



<li><strong>Easy Debugging:</strong> Errors are reported with clear and helpful messages, simplifying the debugging process.</li>
</ol>
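<p>The pretrained-model feature above (point 2) is typically used for transfer learning. The sketch below freezes a VGG16 backbone and attaches a new classification head; the 5-class head and 96&#215;96 input size are hypothetical, and <code>weights=None</code> is used here only to avoid the ImageNet download (in practice you would pass <code>weights="imagenet"</code>):</p>

```python
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.applications import VGG16

# Load VGG16 as a feature extractor. weights=None avoids downloading the
# ImageNet weights for this sketch; use weights="imagenet" in practice.
base = VGG16(weights=None, include_top=False, input_shape=(96, 96, 3))
base.trainable = False  # freeze the backbone during fine-tuning

model = keras.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(5, activation="softmax"),  # new head for a 5-class task
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```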



<hr class="wp-block-separator has-alpha-channel-opacity" />



<figure class="wp-block-image size-large"><img decoding="async" width="1024" height="573" src="https://www.aiuniverse.xyz/wp-content/uploads/2025/01/image-152-1024x573.png" alt="" class="wp-image-20619" srcset="https://www.aiuniverse.xyz/wp-content/uploads/2025/01/image-152-1024x573.png 1024w, https://www.aiuniverse.xyz/wp-content/uploads/2025/01/image-152-300x168.png 300w, https://www.aiuniverse.xyz/wp-content/uploads/2025/01/image-152-768x430.png 768w, https://www.aiuniverse.xyz/wp-content/uploads/2025/01/image-152-1536x859.png 1536w, https://www.aiuniverse.xyz/wp-content/uploads/2025/01/image-152.png 1591w" sizes="(max-width: 1024px) 100vw, 1024px" /></figure>



<h3 class="wp-block-heading">How Keras Works and Its Architecture</h3>



<p>Keras provides a high-level abstraction for building neural networks. Its architecture consists of the following key components:</p>



<ol start="1" class="wp-block-list">
<li><strong>Models:</strong>
<ul class="wp-block-list">
<li>Sequential API: Allows building models layer-by-layer.</li>



<li>Functional API: Facilitates creating complex models with multiple inputs and outputs.</li>
</ul>
</li>



<li><strong>Layers:</strong> Layers are the building blocks of Keras models and can be stacked to create a neural network.</li>



<li><strong>Backend Engine:</strong> Keras uses a backend engine (e.g., TensorFlow) to perform numerical computations.</li>



<li><strong>Loss Functions:</strong> Specifies the objective that the model should optimize.</li>



<li><strong>Optimizers:</strong> Algorithms like SGD, Adam, and RMSProp adjust model weights to minimize the loss function.</li>



<li><strong>Metrics:</strong> Evaluate the model’s performance during training and testing.</li>
</ol>
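<p>The two model-building APIs described in point 1 can be compared directly. This sketch builds the same hypothetical network (4 inputs, 3 output classes) both ways:</p>

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Sequential API: layers stacked in order, one input, one output.
seq_model = keras.Sequential([
    keras.Input(shape=(4,)),
    layers.Dense(16, activation="relu"),
    layers.Dense(3, activation="softmax"),
])

# Functional API: the same network, built by wiring tensors explicitly.
# This style also permits multiple inputs/outputs and shared layers.
inputs = keras.Input(shape=(4,))
x = layers.Dense(16, activation="relu")(inputs)
outputs = layers.Dense(3, activation="softmax")(x)
fn_model = keras.Model(inputs=inputs, outputs=outputs)

# Both expose the same compile/fit/evaluate interface.
seq_model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
fn_model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```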



<hr class="wp-block-separator has-alpha-channel-opacity" />



<h2 class="wp-block-heading">How to Install Keras</h2>



<p>To install Keras, follow these steps:</p>



<p>1. <strong>Install a Backend Framework</strong>: Keras requires a backend engine. You can install TensorFlow, JAX, or PyTorch. For example, to install TensorFlow, use:</p>



<pre class="wp-block-code"><code>pip install tensorflow</code></pre>



<p>2. <strong>Install Keras</strong>: Once the backend is installed, install Keras using pip:</p>



<pre class="wp-block-code"><code>pip install keras</code></pre>



<p>3. <strong>Verify Installation</strong>: To confirm successful installation, run the following in Python:</p>



<pre class="wp-block-code"><code>import keras
print(keras.__version__)</code></pre>






<p>This should display the installed Keras version without errors.</p>



<h2 class="wp-block-heading">Basic Tutorials of Keras: Getting Started</h2>



<p>To get started with Keras, consider the following steps:</p>



<p>1. <strong>Import Necessary Libraries</strong>:</p>



<pre class="wp-block-code"><code>import keras
from keras.models import Sequential
from keras.layers import Dense</code></pre>



<p>2. <strong>Load and Preprocess Data</strong>: Keras provides utilities to load datasets like MNIST. Preprocess the data by normalizing and reshaping as required.</p>



<p>3. <strong>Build the Model</strong>:</p>



<pre class="wp-block-code"><code>model = Sequential()
model.add(Dense(128, activation='relu', input_shape=(input_dim,)))
model.add(Dense(10, activation='softmax'))</code></pre>



<p>4. <strong>Compile the Model</strong>: Specify the optimizer, loss function, and metrics:</p>



<pre class="wp-block-code"><code>model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])</code></pre>
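<p>5. <strong>Train and Evaluate the Model</strong>: Putting the steps together, a minimal end-to-end sketch looks like the following. The synthetic data and the assumed <code>input_dim</code> stand in for a real dataset such as MNIST (where the flattened input size would be 784):</p>

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

input_dim = 64  # assumed feature size; 784 for flattened MNIST images

# Synthetic data standing in for a real, preprocessed dataset.
x_train = np.random.rand(256, input_dim).astype("float32")
y_train = np.random.randint(0, 10, size=(256,))

model = keras.Sequential([
    keras.Input(shape=(input_dim,)),
    layers.Dense(128, activation="relu"),
    layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Train, then evaluate on held-out (here: the same synthetic) data.
model.fit(x_train, y_train, epochs=2, batch_size=32, verbose=0)
loss, acc = model.evaluate(x_train, y_train, verbose=0)
```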






<hr class="wp-block-separator has-alpha-channel-opacity" />



<p>The post <a href="https://www.aiuniverse.xyz/what-is-keras-and-use-cases-of-keras/">What is Keras and Use Cases of Keras?</a> appeared first on <a href="https://www.aiuniverse.xyz">Artificial Intelligence</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://www.aiuniverse.xyz/what-is-keras-and-use-cases-of-keras/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>IMPLEMENTING AI MODELS HAS MADE CRITICAL DISEASE DIAGNOSIS EASY</title>
		<link>https://www.aiuniverse.xyz/implementing-ai-models-has-made-critical-disease-diagnosis-easy/</link>
					<comments>https://www.aiuniverse.xyz/implementing-ai-models-has-made-critical-disease-diagnosis-easy/#respond</comments>
		
		<dc:creator><![CDATA[aiuniverse]]></dc:creator>
		<pubDate>Thu, 08 Jul 2021 09:48:06 +0000</pubDate>
				<category><![CDATA[Artificial Intelligence]]></category>
		<category><![CDATA[AI]]></category>
		<category><![CDATA[Critical]]></category>
		<category><![CDATA[Disease]]></category>
		<category><![CDATA[Implementing]]></category>
		<category><![CDATA[Models]]></category>
		<guid isPermaLink="false">https://www.aiuniverse.xyz/?p=14798</guid>

					<description><![CDATA[<p>Source &#8211; https://www.analyticsinsight.net/ AI&#160;applications are becoming the one-stop solution for diagnosing critical diseases Artificial intelligence and machine learning, are dominating every aspect of our lives. AI is used in various areas like healthcare, education, and defense. With the advancement of technology, better computing power, and the availability of large datasets containing valuable information, the use of AI and ML models <a class="read-more-link" href="https://www.aiuniverse.xyz/implementing-ai-models-has-made-critical-disease-diagnosis-easy/">Read More</a></p>
<p>The post <a href="https://www.aiuniverse.xyz/implementing-ai-models-has-made-critical-disease-diagnosis-easy/">IMPLEMENTING AI MODELS HAS MADE CRITICAL DISEASE DIAGNOSIS EASY</a> appeared first on <a href="https://www.aiuniverse.xyz">Artificial Intelligence</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<p>Source &#8211; https://www.analyticsinsight.net/</p>



<h2 class="wp-block-heading"><strong>AI</strong>&nbsp;applications are becoming the one-stop solution for diagnosing critical diseases</h2>



<p>Artificial intelligence and machine learning are dominating every aspect of our lives. AI is used in areas such as healthcare, education, and defense. With the advancement of technology, better computing power, and the availability of large datasets containing valuable information, the use of AI and ML models has increased. The healthcare sector generates enormous amounts of data in the form of images and patient records, which helps healthcare companies understand patterns and make predictions.</p>



<p>Artificial intelligence is capable of predicting acute critical illness with greater accuracy than the traditional early warning systems (EWS) primarily used by healthcare providers. Although AI serves many purposes in healthcare, predicting critical diseases and their risks in advance has been one of its greatest contributions.</p>



<p>Recently, researchers and healthcare providers have been using machine learning algorithms to automate the diagnosis of critical diseases like cancer, and other cardiovascular complexities, which has caused a paradigm shift in healthcare facilities. They are using ML models for the real-time diagnosis of disease by developing mobile applications. Some mobile apps can even predict the risk of a certain disease in the future and recommend a diagnosis based on the individual’s medical history and other habits.</p>



<p>Even though machine learning and artificial intelligence have brought a revolutionary change to medical facilities, efficient early detection and diagnosis remain a challenge.</p>



<p>Transparency and explainability are of absolute importance when it comes to the widespread introduction of <a href="https://www.analyticsinsight.net/top-10-ai-and-machine-learning-books-for-business-leaders/">AI</a> models into clinical practice. Incorrect predictions carry serious consequences. Healthcare providers must understand the underlying reasoning and technical patterns followed by an application in order to anticipate cases where it might produce false or incorrect predictions. AI-based early warning systems rely on robust and accurate models to predict acute critical diseases.</p>






<h4 class="wp-block-heading"><strong>Using Electronic Health Records for Efficient Prediction of Critical Diseases</strong></h4>



<p>Electronic health records contain information for both medical providers and patients. These records also contain information that could interfere with a machine's ability to make correct predictions. Researchers aim to eliminate the unnecessary data that can hinder a model's capability by deploying a machine learning algorithm called LSAN.</p>



<p>LSAN is a deep neural network that uses a two-pronged approach to scan electronic health records and identify information that could predict whether a patient is at risk of developing a deadly disease in the future.</p>



<p>Electronic records use a double-level hierarchical structure to interpret the medical journey of a patient using the International Classification of Diseases (ICD) codes. It begins with the patient’s current situation and follows through the chronological sequence of visits made by the patient. It records the symptoms and the patient’s condition from the last visit to the current state.</p>



<p>Researchers conducted experiments on patients with symptoms of heart failure, kidney disease, and dementia and found that this newly developed machine learning model, LSAN, outperformed existing medical technologies and deep learning models.</p>



<p>These models and tools can effectively use a patient's age, cholesterol, weight, blood pressure, and several other factors to predict cardiovascular disease and the potential risks that might arise over the next ten years. Hospitals are also increasing the use of business analytics in transportation, patient retention, and other areas to provide patients a better experience and cost-effective treatment.</p>



<p>The application of AI in this diagnostic process can be of immense support to healthcare providers and patients. The implementation of AI in the medical infrastructure speeds up the identification of relevant medical data from multiple sources, which saves time and resources for the patients and medical practitioners.</p>
<p>The post <a href="https://www.aiuniverse.xyz/implementing-ai-models-has-made-critical-disease-diagnosis-easy/">IMPLEMENTING AI MODELS HAS MADE CRITICAL DISEASE DIAGNOSIS EASY</a> appeared first on <a href="https://www.aiuniverse.xyz">Artificial Intelligence</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://www.aiuniverse.xyz/implementing-ai-models-has-made-critical-disease-diagnosis-easy/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>KEY TO SCALE IN VOLATILE MARKETS: RE-INVENTING BUSINESS MODELS WITH THE SUCCESSFUL DEPLOYMENT OF AI AND DATA SCIENCE BY ANEES MERCHANT EXECUTIVE VICE PRESIDENT – APPLIED AI &#038; DIGITAL AT COURSE5 INTELLIGENCE</title>
		<link>https://www.aiuniverse.xyz/key-to-scale-in-volatile-markets-re-inventing-business-models-with-the-successful-deployment-of-ai-and-data-science-by-anees-merchant-executive-vice-president-applied-ai-digital-at-course/</link>
					<comments>https://www.aiuniverse.xyz/key-to-scale-in-volatile-markets-re-inventing-business-models-with-the-successful-deployment-of-ai-and-data-science-by-anees-merchant-executive-vice-president-applied-ai-digital-at-course/#respond</comments>
		
		<dc:creator><![CDATA[aiuniverse]]></dc:creator>
		<pubDate>Fri, 25 Jun 2021 09:51:34 +0000</pubDate>
				<category><![CDATA[Artificial Intelligence]]></category>
		<category><![CDATA[AI]]></category>
		<category><![CDATA[Business]]></category>
		<category><![CDATA[data science]]></category>
		<category><![CDATA[deployment]]></category>
		<category><![CDATA[INVENTING]]></category>
		<category><![CDATA[Markets]]></category>
		<category><![CDATA[Merchant]]></category>
		<category><![CDATA[Models]]></category>
		<category><![CDATA[scale]]></category>
		<category><![CDATA[Successful]]></category>
		<category><![CDATA[VOLATILE]]></category>
		<guid isPermaLink="false">https://www.aiuniverse.xyz/?p=14529</guid>

					<description><![CDATA[<p>Source &#8211; https://www.analyticsinsight.net/ Organisations often block their own path towards scaling by delaying innovations purely out of their loyalty to their core businesses. Unfortunately, fixed organisational structures and legacy operating models result in frailty, disabling the sight of potential market changes. Enterprises hesitate to build products or services with new technology as they are unsure <a class="read-more-link" href="https://www.aiuniverse.xyz/key-to-scale-in-volatile-markets-re-inventing-business-models-with-the-successful-deployment-of-ai-and-data-science-by-anees-merchant-executive-vice-president-applied-ai-digital-at-course/">Read More</a></p>
<p>The post <a href="https://www.aiuniverse.xyz/key-to-scale-in-volatile-markets-re-inventing-business-models-with-the-successful-deployment-of-ai-and-data-science-by-anees-merchant-executive-vice-president-applied-ai-digital-at-course/">KEY TO SCALE IN VOLATILE MARKETS: RE-INVENTING BUSINESS MODELS WITH THE SUCCESSFUL DEPLOYMENT OF AI AND DATA SCIENCE BY ANEES MERCHANT EXECUTIVE VICE PRESIDENT – APPLIED AI &#038; DIGITAL AT COURSE5 INTELLIGENCE</a> appeared first on <a href="https://www.aiuniverse.xyz">Artificial Intelligence</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<p>Source &#8211; https://www.analyticsinsight.net/</p>



<p>Organisations often block their own path towards scaling by delaying innovations purely out of loyalty to their core businesses. Unfortunately, fixed organisational structures and legacy operating models result in fragility, blinding organisations to potential market changes. Enterprises hesitate to build products or services with new technology because they are unsure whether the growth rates would satisfy their shareholders.</p>



<p>Today, to compete in a VUCA environment, businesses and their change agents must reconsider how they engage and conduct business with their customers, and leverage innovative technologies to enhance or introduce products and services to the market.</p>



<p>With rapidly changing consumer buying patterns and preferences, enterprises are increasingly focusing on digitization to evolve their business models. They are also being mindful of the efficiency of business operating units, which enables them to quickly pinpoint areas that need additional focus or restructuring. Some of the possible initiatives organisations can opt for are:</p>



<ul class="wp-block-list"><li><strong>Growth hacking:</strong> Organizations need to be agile and iterative in their approach to keep up with changes in the industry. A growth mindset enables management to embrace challenges, show resilience while working through obstacles, and bounce back from impediments sooner, leading to higher overall achievement. Organizations backed by a growth-hacking mindset will be able to foster innovation and generate higher financial returns.</li><li><strong>Retooling the organization:</strong> Companies should avoid putting all their eggs in one basket regarding technology infrastructure. The focus should be on adopting technology like building a Lego structure, where individual components are replaced if they no longer meet the scalability and validity requirements of the business's current and future needs.</li><li><strong>Adopting new-age innovation rather than reinventing:</strong> Applications of AI are evolving within the industry at a fast pace, which enables organizations to evaluate, adapt, pilot, and scale quickly. Reinventing AI would mean wasting time and resources; instead, an organization's precious resources can be spent identifying the right opportunities to evaluate and scale the benefits of AI.</li></ul>

<p>The global COVID-19 pandemic has crushed standards and redefined how business is conducted, affecting most enterprises in some way or another. Enterprises were already leveraging data science and AI in the past few years, and a significantly greater number of organizations are now looking for ways to harness them to reinvent themselves. Key focus areas remain strategy building, decision-making and governance setup, business planning and budgeting, funding decisions, managing performance and company culture, risk management, and more. For businesses, resiliency will become even more significant than efficiency as they move forward, and data science will help companies maintain it.</p>

<p>For instance, retail stores and restaurants that were more dependent on brick-and-mortar sales before the pandemic had to make drastic changes to survive. While some were forced to shut shop, the rest kept steering ahead with new business models to adapt and thrive. Data science helped companies stabilise their organisations, build new processes, establish new communication channels and workflows, adapt to the remote working environment, recognise (and adapt to) changing consumer patterns, and identify emerging trends using AI and machine learning.</p>

<p>Traditionally, legacy companies focused only on their core business. With the new wave of transformation and new opportunities after the pandemic, these prominent established players are reinventing themselves and creating businesses in new areas with a very different mindset and culture than their traditional organizations. The new digital era demands a significant change in traditional thinking, along with a practical approach to collaboration, competition, and innovation that can combine data science, AI, and business acumen to conceive, build, and bring new digital products to market at scale.</p>
<p>The post <a href="https://www.aiuniverse.xyz/key-to-scale-in-volatile-markets-re-inventing-business-models-with-the-successful-deployment-of-ai-and-data-science-by-anees-merchant-executive-vice-president-applied-ai-digital-at-course/">KEY TO SCALE IN VOLATILE MARKETS: RE-INVENTING BUSINESS MODELS WITH THE SUCCESSFUL DEPLOYMENT OF AI AND DATA SCIENCE BY ANEES MERCHANT EXECUTIVE VICE PRESIDENT – APPLIED AI &#038; DIGITAL AT COURSE5 INTELLIGENCE</a> appeared first on <a href="https://www.aiuniverse.xyz">Artificial Intelligence</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://www.aiuniverse.xyz/key-to-scale-in-volatile-markets-re-inventing-business-models-with-the-successful-deployment-of-ai-and-data-science-by-anees-merchant-executive-vice-president-applied-ai-digital-at-course/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>MODEL SEARCH: A PLATFORM FOR FINDING OPTIMAL ML MODELS</title>
		<link>https://www.aiuniverse.xyz/model-search-a-platform-for-finding-optimal-ml-models/</link>
					<comments>https://www.aiuniverse.xyz/model-search-a-platform-for-finding-optimal-ml-models/#respond</comments>
		
		<dc:creator><![CDATA[aiuniverse]]></dc:creator>
		<pubDate>Mon, 15 Mar 2021 06:25:04 +0000</pubDate>
				<category><![CDATA[Machine Learning]]></category>
		<category><![CDATA[finding]]></category>
		<category><![CDATA[ML]]></category>
		<category><![CDATA[model]]></category>
		<category><![CDATA[Models]]></category>
		<category><![CDATA[Optimal]]></category>
		<category><![CDATA[platform]]></category>
		<guid isPermaLink="false">http://www.aiuniverse.xyz/?p=13481</guid>

					<description><![CDATA[<p>Source &#8211; https://www.analyticsinsight.net/ As known to many, Google has recently released Model Search which is an open-source platform. This caters to developing efficient and best machine learning models automatically. Rather than focusing on a particular domain, Model Search is domain agnostic and flexible beyond imagination. Well, not just that. To our surprise, it is even <a class="read-more-link" href="https://www.aiuniverse.xyz/model-search-a-platform-for-finding-optimal-ml-models/">Read More</a></p>
<p>The post <a href="https://www.aiuniverse.xyz/model-search-a-platform-for-finding-optimal-ml-models/">MODEL SEARCH: A PLATFORM FOR FINDING OPTIMAL ML MODELS</a> appeared first on <a href="https://www.aiuniverse.xyz">Artificial Intelligence</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<p>Source &#8211; https://www.analyticsinsight.net/</p>



<p>Google has recently released Model Search, an open-source platform for automatically developing efficient, high-performing machine learning models. Rather than focusing on a particular domain, Model Search is domain-agnostic and flexible: it can find an architecture that best fits a given dataset and problem, while minimizing the time, effort, and resources spent on coding.</p>



<p>Model Search is built on TensorFlow and can run either on a single machine or in a distributed setting. It is equipped with multiple trainers, a search algorithm, a transfer learning algorithm, and a database for storing the evaluated models.</p>



<h3 class="wp-block-heading"><strong>The Architecture of Model Search</strong></h3>



<p>The architecture of Model Search is based on four foundational components:</p>



<ol class="wp-block-list"><li><strong>Model Trainers:</strong> These components train and evaluate the various candidate models asynchronously.</li><li><strong>Search Algorithm:</strong> The search algorithm selects the best trained architectures. The user can also add &#8220;mutations&#8221; to them, which are sent to the trainers for further evaluation.</li><li><strong>Transfer Learning Algorithm:</strong> Model Search reuses knowledge across experiments in two ways: knowledge distillation, which improves accuracy by adding a loss term that matches the predictions of high-performing models, and weight sharing, which bootstraps some of a network&#8217;s parameters from previously trained candidates.</li><li><strong>Model Database:</strong> This is where experiment results are persisted so they can be reused across search cycles.</li></ol>



<h3 class="wp-block-heading"><strong>What Makes Model Search So Unique?</strong></h3>



<ul class="wp-block-list"><li>One aspect that makes Model Search stand out is its ability to run training and evaluation experiments for AI models adaptively and asynchronously, which lets the trainers share the knowledge gained from their experiments. At the beginning of every cycle, the search algorithm reviews all completed trials, uses beam search to decide what to try next, then invokes a mutation and assigns the resulting model back to a trainer.</li><li>After a Model Search run, users can compare the many models found during the search. The platform also allows users to define their own search space, making it possible to customize the architectural elements in their models.</li><li>The researchers have also claimed that Model Search improves upon production models with minimal iterations.</li></ul>



<p>All in all, the Model Search code aims to provide researchers with a flexible, domain-agnostic framework for developing efficient, high-performing machine learning models. The framework can build models with state-of-the-art performance, and it can handle well-known problems when provided with a search space composed of standard building blocks.</p>
<p>The post <a href="https://www.aiuniverse.xyz/model-search-a-platform-for-finding-optimal-ml-models/">MODEL SEARCH: A PLATFORM FOR FINDING OPTIMAL ML MODELS</a> appeared first on <a href="https://www.aiuniverse.xyz">Artificial Intelligence</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://www.aiuniverse.xyz/model-search-a-platform-for-finding-optimal-ml-models/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>Robust Data-Driven Machine-Learning Models for Subsurface Applications: Are We There Yet?</title>
		<link>https://www.aiuniverse.xyz/robust-data-driven-machine-learning-models-for-subsurface-applications-are-we-there-yet/</link>
					<comments>https://www.aiuniverse.xyz/robust-data-driven-machine-learning-models-for-subsurface-applications-are-we-there-yet/#respond</comments>
		
		<dc:creator><![CDATA[aiuniverse]]></dc:creator>
		<pubDate>Mon, 01 Mar 2021 06:57:54 +0000</pubDate>
				<category><![CDATA[Machine Learning]]></category>
		<category><![CDATA[applications]]></category>
		<category><![CDATA[data-driven]]></category>
		<category><![CDATA[machine-learning]]></category>
		<category><![CDATA[Models]]></category>
		<category><![CDATA[Robust]]></category>
		<category><![CDATA[Subsurface]]></category>
		<guid isPermaLink="false">http://www.aiuniverse.xyz/?p=13139</guid>

					<description><![CDATA[<p>Source &#8211; https://jpt.spe.org/ Algorithms are taking over the world, or so we are led to believe, given their growing pervasiveness in multiple fields of human endeavor such as consumer marketing, finance, design and manufacturing, health care, politics, sports, etc. The focus of this article is to examine where things stand in regard to the application <a class="read-more-link" href="https://www.aiuniverse.xyz/robust-data-driven-machine-learning-models-for-subsurface-applications-are-we-there-yet/">Read More</a></p>
<p>The post <a href="https://www.aiuniverse.xyz/robust-data-driven-machine-learning-models-for-subsurface-applications-are-we-there-yet/">Robust Data-Driven Machine-Learning Models for Subsurface Applications: Are We There Yet?</a> appeared first on <a href="https://www.aiuniverse.xyz">Artificial Intelligence</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<p>Source &#8211; https://jpt.spe.org/</p>



<p>Algorithms are taking over the world, or so we are led to believe, given their growing pervasiveness in multiple fields of human endeavor such as consumer marketing, finance, design and manufacturing, health care, politics, sports, etc. The focus of this article is to examine where things stand in regard to the application of these techniques for managing subsurface energy resources in domains such as conventional and unconventional oil and gas, geologic carbon sequestration, and geothermal energy.</p>



<p><strong>Srikanta Mishra</strong>&nbsp;and&nbsp;<strong>Jared Schuetter,&nbsp;</strong>Battelle Memorial Institute;&nbsp;<strong>Akhil Datta-Gupta,&nbsp;</strong>SPE, Texas A&amp;M University; and&nbsp;<strong>Grant Bromhal,</strong>&nbsp;National Energy Technology Laboratory, US Department of Energy</p>



<p>It is useful to start with some definitions to establish a common vocabulary.</p>



<ul class="wp-block-list"><li><strong>Data analytics (DA)</strong>—Sophisticated data collection and analysis to understand and model hidden patterns and relationships in complex, multivariate data sets</li><li><strong>Machine learning (ML)</strong>—Building a model between predictors and response, where an algorithm (often a black box) is used to infer the underlying input/output relationship from the data</li><li><strong>Artificial intelligence (AI)</strong>—Applying a predictive model with new data to make decisions without human intervention (and with the possibility of feedback for model updating)</li></ul>



<p>Thus, DA can be thought of as a broad framework that helps determine what happened (descriptive analytics), why it happened (diagnostic analytics), what will happen (predictive analytics), or how we can make something happen (prescriptive analytics) (Sankaran et al. 2019). Although DA is built upon a foundation of classical statistics and optimization, it has increasingly come to rely upon ML, especially for predictive and prescriptive analytics (Donoho 2017). While the terms DA, ML, and AI are often used interchangeably, it is important to recognize that ML is basically a subset of DA and a core enabling element of the broader decision-making construct that is AI.</p>



<p>In recent years, there has been a proliferation of studies using ML for predictive analytics in the context of subsurface energy resources. Consider how the number of papers on ML in the OnePetro database has been increasing exponentially since 1990 <strong>(Fig. 1).</strong> These trends are also reflected in the number of technical sessions devoted to ML/AI topics in conferences organized by SPE, AAPG, and SEG, among others, as well as in books targeted to practitioners in these professions (Holdaway 2014; Mishra and Datta-Gupta 2017; Mohaghegh 2017; Misra et al. 2019).</p>



<p>Given these high levels of activity, our goal is to provide some observations and recommendations on the practice of data-driven model building using ML techniques. The observations are motivated by our belief that some geoscientists and petroleum engineers may be jumping the gun by applying these techniques in an ad hoc manner without any foundational understanding, whereas others may be holding off on using these methods because they do not have any formal ML training and could benefit from some concrete advice on the subject. The recommendations are conditioned by our experience in applying both conventional statistical modeling and data analytics approaches to practical problems. To that end, we ask and (try to) answer the following questions:</p>



<ul class="wp-block-list"><li>Why ML models and when?</li><li>One model or many?</li><li>Which predictors matter?</li><li>Can data-driven models become physics-informed?</li><li>What are some challenges going forward?</li></ul>



<h2 class="wp-block-heading">Why ML Models and When?</h2>



<p>Historically, subsurface science and engineering analyses have relied on mechanistic (physics-based) models, which include a causal understanding of input/output relationships. Unsurprisingly, experienced professionals are wary of purely data-driven black-box ML models that appear to be devoid of any such understanding. Nevertheless, the use of ML models is easy to justify if the relevant physics-based model is computationally intensive or immature, or if a suitable mechanistic modeling paradigm does not exist. Furthermore, Holm (2019) posits that, even though humans cannot assess how a black-box model arrives at a particular answer, such models can be useful in science and engineering in certain cases. The three cases that she identifies, and some corresponding oil and gas examples, follow.</p>



<ul class="wp-block-list"><li><strong>When the cost of a wrong answer is low relative to the value of a correct answer</strong>&nbsp;(e.g., using an ML-based proxy model to carry out initial explorations in the parameter space during history matching, with further refinements in the vicinity of the optimal solution applied using a full-physics model)</li><li><strong>When they produce the best results&nbsp;</strong>(e.g., using a large number of pregenerated images to seed a pattern-recognition algorithm for matching the observed pressure derivative signature to an underlying conceptual model during well-test analysis)</li><li><strong>As tools to inspire and guide human inquiry</strong>&nbsp;(e.g., using operational and historical data for electrical submersible pumps in unconventional wells to understand the factors and conditions responsible for equipment failure or suboptimal performance and perform preventative maintenance as needed)</li></ul>



<p>It should be noted that data-driven modeling does not preclude the use of conventional statistical models such as linear/linearized regression, principal component analysis for dimension reduction, or cluster analysis to identify natural groupings within the data (in addition to, or as an alternative to, black-box models). This sets up the data-modeling culture vs. algorithm-modeling culture debate as first noted by Breiman (2001). In our view, the two approaches can and should coexist, with ML methods being preferred if they are clearly superior in terms of predictive accuracy, albeit often at the cost of interpretability. If both approaches provide comparable results at comparable speeds, then conventional statistical models should be chosen because of their transparency.</p>



<h2 class="wp-block-heading">One Model or Many?</h2>



<p>Although the concept of a single correct model has been conventional wisdom for quite some time, the practice of geostatistics has influenced the growing acceptance that multiple plausible geologic models (and their equivalent dynamic reservoir models) can exist (Coburn et al. 2007). This issue of nonuniqueness can be extended readily to other application domains such as drilling, production, and predictive maintenance. The idea of an ensemble of acceptable models simply recognizes that every model—through its assumptions, architecture, and parameterization—has a unique way of characterizing the relationships between the predictors and the responses. Furthermore, multiple such models can provide very similar fits to training or test data, although their performance with respect to future predictions or identification of variable importance can be quite different.</p>



<p>Much like a “wisdom of crowds” sentiment for decision-making at the societal level, ensemble modeling approaches combine predictions from different models with the goal of improving predictions beyond what a single model can provide. They have also routinely appeared as top solutions to the well-known Kaggle data analysis competitions. Approaches for model aggregation may include a simple unweighted average of all model predictions or a weighted average based on model goodness of fit (e.g., root-mean-squared error or a similar error metric). Alternatively, multiple model predictions can be combined using a process called stacking, where a set of base models are used to predict the response of interest using the original inputs, and then their predictions are used as predictors in a final ML-based model, as shown in the work flow of <strong>Fig. 2</strong> (Schuetter et al. 2019).</p>
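<p>The stacking work flow described above (base-model predictions feeding a final model) can be sketched with a deliberately tiny meta-learner: a grid search for the convex combination of two base models' predictions that minimizes held-out squared error. This is an illustration of the idea only, not the work flow of Fig. 2:</p>

```python
def stack_two_models(base_preds, y):
    """Tiny stacking meta-learner: find the weight w in [0, 1] such that
    w * model1 + (1 - w) * model2 best fits the held-out targets y."""
    p1, p2 = base_preds
    best_w, best_err = 0.0, float("inf")
    for i in range(101):  # candidate weights 0.00, 0.01, ..., 1.00
        w = i / 100
        err = sum((w * a + (1 - w) * b - t) ** 2
                  for a, b, t in zip(p1, p2, y))
        if err < best_err:
            best_w, best_err = w, err
    return best_w

# If model 1 is exactly right and model 2 is not, stacking should
# put all of its weight on model 1.
w = stack_two_models(([1.0, 2.0, 3.0], [1.5, 2.5, 2.0]), [1.0, 2.0, 3.0])
```

<p>In practice the meta-learner would itself be an ML model trained on many base models' predictions, but the principle is the same.</p>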



<p>Given that there is no a priori way to choose the best ML algorithm for a problem at hand, at least in our experience, we recommend starting with a simple linear regression or classification model (ideally, no ML model should underperform this base model). This would be supplemented by one or more tree-based models [e.g., random forest (RF) or gradient boosting machine (GBM)] and one or more nontree-based models [e.g., support vector machine (SVM) or artificial neural network (ANN)]. Because of their architecture, tree-based models can be quite robust, sidestepping many issues that tend to plague conventional statistical models (e.g., monotone transformation of predictors, collinearity, sensitivity to outliers, and normality assumptions). They also tend to produce good performance without excessive tuning, so they are generally easy to train and use. Models such as SVM and ANN require more effort to implement—in the former case, because of the need to be more careful with predictor representation and outliers, and, in the latter case, because of the large number of tuning parameters and resources required; however, they traditionally also have shown better performance.</p>



<p>The suite of acceptable models, based on a goodness-of-fit threshold, would then be combined using the model aggregation concepts described earlier. The benefits would be robust predictions as well as ranking of variable interactions that integrate multiple perspectives.</p>
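<p>One simple aggregation option mentioned earlier, a weighted average based on goodness of fit, can be sketched as follows (weighting each model inversely to its RMSE is one plausible choice, assumed here for illustration):</p>

```python
def weighted_ensemble(preds_by_model, rmse_by_model):
    """Combine predictions from several models, weighting each model
    inversely to its root-mean-squared error on validation data."""
    inv = [1.0 / r for r in rmse_by_model]
    total = sum(inv)
    weights = [v / total for v in inv]
    n = len(preds_by_model[0])
    return [sum(w * preds[i] for w, preds in zip(weights, preds_by_model))
            for i in range(n)]

# Two models with equal RMSE get equal weight: a plain average.
combined = weighted_ensemble([[2.0, 4.0], [4.0, 8.0]], [1.0, 1.0])
```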



<h2 class="wp-block-heading">Which Predictors Matter?</h2>



<p>For black-box models, we strongly believe that it is not sufficient merely to obtain the model prediction (i.e., what will happen); it is also necessary to understand how the predictors are affecting the response (i.e., why it will happen). At some point, every model should require human review to understand what it does because (a) all models are wrong (thanks, George Box), (b) all models are based on assumptions, and (c) humans have a tendency to be overconfident in models and use them even when those assumptions are violated. To that end, answering the question “Which predictors matter?” can help provide some inkling into the inner workings of the black-box model and, thus, addresses the issue of model interpretability. In fact, one of the biggest pushbacks against the widespread adoption of ML models is the perception of a lack of transparency in the black-box modeling paradigm (Holm 2019). Therefore, it is important to ensure that a robust approach toward determining (and communicating) variable importance is an integral element of the work flow for data-driven modeling using ML methods.</p>



<p>A review of the subsurface ML modeling literature suggests that ranking of input variables (predictors) with respect to their effect on the output variable of interest (response) seems to be carried out sporadically and mostly when the ML algorithm used in the study happens to include a built-in importance metric (as in the case of RF, GBM, or certain ANN implementations). In our experience, it is more useful to consider a model-agnostic variable-importance strategy, which also lends itself to the ensemble modeling construct. This can help create a meta-ranking of importance across multiple plausible models (much like using a panel of judges in a figure skating competition).</p>



<p>As Schuetter et al. (2018) have shown, the importance rankings may fluctuate from model to model, but, collectively, they provide a more robust perspective on the relative importance of predictors aggregated across multiple models. Some of those model-independent importance-ranking approaches, as explained in detail in Molnar (2020), are summarized in <strong>Table 1.</strong> We have found the Permute approach to be the most robust and easy to implement and explain without incurring any significant additional computational burden beyond the original model fitting process.</p>



<h2 class="wp-block-heading">Can Data-Driven Models Become Physics-Informed?</h2>



<p>Standard data-driven ML algorithms are trained solely on data. To ensure good predictive power, the training typically requires large amounts of data that may not be readily available, particularly during early stages of field development. Even if adequate data are available, the results are often difficult to interpret or may be physically unrealistic. To address these challenges, a new class of physics-informed ML is being actively investigated (Raissi et al. 2019). The loss function in a data-driven ML model (such as an ANN) typically consists of only the data misfit term. In contrast, in the physics-informed neural network (PINN) modeling approaches, the models are trained to minimize the data misfit while accounting for the underlying physics, typically described by governing partial differential equations. This ensures physically consistent predictions and lower data requirements because the solution space is constrained by physical laws. For subsurface flow and transport modeling using PINN, the residual of the governing mass-balance equations is typically used as the additional term in the loss function.</p>



<p>For illustrative purposes, <strong>Fig. 3</strong> shows 3D pressure maps in an unconventional reservoir generated using the PINN approach and comparison with a standard neural network (NN) approach. To train the PINN, the loss function here is set as L=Ld+Lr, where Ld is the data misfit in terms of initial pressure, boundary pressure, and gas production rate and Lr is the residual with respect to the governing mass-balance equation that is specified using a computationally efficient Eikonal form of the original equations (Zhang et al. 2016). Almost identical results are obtained using the PINN and standard NN in terms of matching the gas production rate. However, the pressure maps generated using the PINN show close agreement with 3D numerical simulation results, whereas the standard NN shows pressure depletion over a much larger region. Furthermore, the predictions using the PINN are two orders of magnitude faster than the 3D numerical simulator for this example.</p>
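<p>The composite loss L = Ld + Lr can be illustrated with a deliberately simple stand-in: a one-parameter model u(x) = a·x, observed (x, u) data for Ld, and a toy governing equation du/dx = 2 whose residual plays the role of Lr. (The actual Lr above uses the Eikonal form of the mass-balance equations; this sketch only shows the structure of the loss.)</p>

```python
def pinn_style_loss(a, xs, us, target_slope=2.0):
    """Toy L = Ld + Lr for the linear 'network' u(x) = a * x.
    Ld: mean squared misfit against observed (x, u) pairs.
    Lr: squared residual of the toy governing equation du/dx = target_slope
        (for this model, du/dx is simply the parameter a)."""
    ld = sum((a * x - u) ** 2 for x, u in zip(xs, us)) / len(xs)
    lr = (a - target_slope) ** 2
    return ld + lr

# A parameter that fits the data and honors the physics has zero loss;
# one that fits sparse data but violates du/dx = 2 is penalized by Lr.
loss_consistent = pinn_style_loss(2.0, [1.0], [2.0])
loss_violating = pinn_style_loss(3.0, [1.0], [3.0])
```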



<h2 class="wp-block-heading">What Are Some Key Challenges Going Forward?</h2>



<p>Next, we address some of the lingering questions and comments that have commonly been raised during the first author’s SPE Distinguished Lecture question-and-answer sessions, in industry- and research-oriented technical forums related to ML, and in conversations with decision-makers and stakeholders.</p>



<p>&#8220;Our ML models are not very good.&#8221; Consumer marketing and social-media entities (e.g., Google, Facebook, Netflix) are forced to use ML/AI models to predict human behavior because there is no mechanistic modeling alternative. There is a general (but mistaken) perception in our industry that these models must be highly accurate (because they are used so often), whereas subsurface ML models can show higher errors depending on the problem being solved, the size of the training data set, and the inclusion of relevant causal variables. We need to manage the (misplaced) expectation that subsurface ML models must provide near-perfect fits to data and focus more on how the data-driven model can serve as a complement to physics-based models and add value for decision-making. Also, the application of ML models in predictive mode for a different set of geological conditions (spatially), or extending into the future where a different flow regime might be valid (temporally), should be treated with caution because data-driven models have limited ability to project the unseen. In other words, the past may not always be prologue for such models.</p>



<p>“If I don’t understand the model, how can I believe it?” This common human reaction to anything that lies beyond one’s sphere of knowledge can be countered by a multipronged approach: (a) articulating the extent to which the predictors span the space of the most relevant causal variables for the problem of interest, (b) demonstrating the robustness of the model with both training and (cross) validation data sets, (c) explaining how the predictors affect the response to provide insights into the inner workings of the model by using variable importance and conditional sensitivity analysis (Mishra and Datta-Gupta 2017), and (d) supplementing this understanding of input/output relationships through creative visualizations.</p>



<p>&#8220;We are still looking for the &#8216;Aha!&#8217; moment.&#8221; Another common refrain against ML models is that they fail to produce some profound insights on system behavior that were not known before. There are times when a data-driven model will produce insights that are novel, whereas, in other situations, it will merely substantiate conventional subject-matter expertise on key factors affecting the system response. The value of the ML model in either case lies in providing a quantitative data-driven framework for describing the input/output relationships, which should prove useful to the domain expert whenever a physics-based model takes too long to run, requires more data than are readily available, or is at an immature or evolving state.</p>



<p>“My staff need to learn data science, but how?” There appears to be a grassroots trend where petroleum engineers and geoscientists are trying to reinvent themselves by informally picking up some knowledge of machine learning and statistics from open sources such as YouTube videos, code and scripts from GitHub, and online courses. Following Donoho (2017), we believe that becoming a citizen data scientist (i.e., one who learns from data) requires more—that is, formally supplementing one’s domain expertise with knowledge of conventional data analysis (from statistics), programming in Python/R (from computer science), and machine learning (from both statistics and computer science). Organizations, therefore, should promote a training regime for their subsurface engineers and scientists that provides such competencies.</p>



<p>In the context of technology advancement and workforce upskilling, it is worth pointing out a recently launched initiative by the US Department of Energy known as Science-Informed Machine Learning for Accelerating Real-Time Decisions in Subsurface Applications (SMART) (https://edx.netl.doe.gov/smart/). This initiative is funded by DOE&#8217;s Carbon Storage and Upstream Oil and Gas Program and has three main focus areas:</p>



<ul class="wp-block-list"><li><strong>Real-time visualization</strong>—to enable dramatic improvements in the visualization of key subsurface features and flows by exploiting machine learning to substantially increase speed and enhance detail</li><li><strong>Real-time forecasting</strong>—to transform reservoir management by rapid analysis of real-time data and rapid forward prediction under uncertainty to inform operational decisions</li><li><strong>Virtual learning</strong>—to develop a computer-based experiential learning environment to improve field development and monitoring strategies</li></ul>



<p>The SMART team is engaging with university, national laboratory, and industry partners and is building on ongoing and historical data collected from DOE-supported field laboratories and regional partnerships and initiatives since the early 2000s. A key area of experimentation within SMART is the use of deep-learning techniques (e.g., convolutional and graph neural networks, autoencoders/decoders, long short-term memory) for building 3D spatiotemporal data-driven models on the basis of field observations or synthetic data.</p>



<h2 class="wp-block-heading">Epilogue</h2>



<p>The buzz surrounding DA and AI/ML from multiple business, health, social, and applied science domains has found its way into several oil and gas (and related subsurface science and engineering) applications. Within our area of work, there is significant ongoing activity related to technology adaptation and development, as well as both informal and formal upskilling of geoenergy professionals to create citizen data scientists. The current status of this field, however, can best be classified as somewhat immature; it reminds us of the situation with geostatistics in the early 1990s, when the potential of the technology was beginning to be realized by the industry but was not yet fully adopted for mainstream applications.</p>



<p>To that end, we have highlighted several issues that should be properly addressed to make data-driven models more robust (i.e., accurate, efficient, understandable, and useful) while promoting foundational understanding of ML-related technologies among petroleum engineers and geoscientists. We believe that the appropriate mindset is not to treat these data-driven modeling problems as mere curve-fitting exercises using very flexible and powerful algorithms that are easily abused into overfitting, but to try to extract insights from data that can be translated into actionable information for making better decisions. As the poet T.S. Eliot said: &#8220;Where is the wisdom we have lost in knowledge? Where is the knowledge we have lost in information?&#8221; By extension, where is the information that is hiding in our data? May these thoughts help guide our journey toward better ML-based data-driven models for subsurface energy resource applications.</p>
<p>The post <a href="https://www.aiuniverse.xyz/robust-data-driven-machine-learning-models-for-subsurface-applications-are-we-there-yet/">Robust Data-Driven Machine-Learning Models for Subsurface Applications: Are We There Yet?</a> appeared first on <a href="https://www.aiuniverse.xyz">Artificial Intelligence</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://www.aiuniverse.xyz/robust-data-driven-machine-learning-models-for-subsurface-applications-are-we-there-yet/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>How to avoid overfitting in machine learning models</title>
		<link>https://www.aiuniverse.xyz/how-to-avoid-overfitting-in-machine-learning-models/</link>
					<comments>https://www.aiuniverse.xyz/how-to-avoid-overfitting-in-machine-learning-models/#respond</comments>
		
		<dc:creator><![CDATA[aiuniverse]]></dc:creator>
		<pubDate>Thu, 19 Nov 2020 06:17:09 +0000</pubDate>
				<category><![CDATA[Machine Learning]]></category>
		<category><![CDATA[AI]]></category>
		<category><![CDATA[Automated]]></category>
		<category><![CDATA[deep learning]]></category>
		<category><![CDATA[Machine learning]]></category>
		<category><![CDATA[Models]]></category>
		<guid isPermaLink="false">http://www.aiuniverse.xyz/?p=12398</guid>

					<description><![CDATA[<p>Source: searchenterpriseai.techtarget.com Overfitting is a common modeling error all enterprises who deploy machine and deep learning will encounter. When machine learning models allow noise, random data or natural fluctuations in data to dictate target functions, the model suffers in its ability to generalize new data beyond its training set.  Data scientists must build their models <a class="read-more-link" href="https://www.aiuniverse.xyz/how-to-avoid-overfitting-in-machine-learning-models/">Read More</a></p>
<p>The post <a href="https://www.aiuniverse.xyz/how-to-avoid-overfitting-in-machine-learning-models/">How to avoid overfitting in machine learning models</a> appeared first on <a href="https://www.aiuniverse.xyz">Artificial Intelligence</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<p>Source: searchenterpriseai.techtarget.com</p>



<p>Overfitting is a common modeling error that every enterprise deploying machine and deep learning will encounter. When machine learning models allow noise, random data or natural fluctuations in data to dictate target functions, the model suffers in its ability to generalize to new data beyond its training set.</p>



<p>Data scientists must build their models carefully, train them effectively, improve their machine learning literacy and instill incentive systems to combat overfitting.</p>



<h3 class="wp-block-heading">Defining an overfitted model</h3>



<p>Training machine learning and deep learning models is rife with potential failure &#8212; a major issue being overfitting. Generally, overfitting occurs when a model has trained so closely on a specific dataset that it is only useful for making predictions on that training set and struggles to adapt to a new one.</p>



<p>In overfitting, the model has memorized what patterns to look for in the training set, rather than learned what to look for in data generally. To a data scientist, the model appears to be trained effectively, but when applied to other datasets it is unable to repeat this success.</p>



<p>&#8220;We want the model to generalize well to unseen data,&#8221; Benyamin Ghojogh, a graduate research assistant of machine learning and PhD candidate at the University of Waterloo, said.</p>



<p>For Ghojogh, avoiding overfitting requires a delicate balance: giving the model the right amount of detail to look for and train on, without providing so little information that the model underfits.</p>



<p>&#8220;Something in the middle is good,&#8221; Ghojogh said. &#8220;And that&#8217;s a perfect fit, which can generalize to the new data and seen data.&#8221;</p>



<p>But this is a difficult point to reach, and one that becomes even more of an issue as your models become more complex.</p>



<p>&#8220;If you make the model complex but you don&#8217;t have sufficient data to train it, the model will definitely get overfit and that&#8217;s why neural networks are prone to it,&#8221; Ghojogh said.</p>
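<p>Ghojogh's point that a complex model with too little data will overfit can be demonstrated in a few lines. The sketch below fits a polynomial that passes exactly through four noisy training points (zero training error by construction) and then checks it on a held-out point, where the error is no longer zero:</p>

```python
def interpolating_poly(xs, ys, x):
    """Lagrange polynomial through all training points: a maximally
    complex model for this data, guaranteed zero training error."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total

# The true signal is y = x; the training labels carry noise.
train_x = [0.0, 1.0, 2.0, 3.0]
train_y = [0.1, 0.9, 2.2, 2.7]
train_err = sum((interpolating_poly(train_x, train_y, x) - y) ** 2
                for x, y in zip(train_x, train_y))
test_err = (interpolating_poly(train_x, train_y, 1.5) - 1.5) ** 2
```

<p>The model has memorized the noise perfectly (training error is exactly zero) yet misses the held-out point: the signature of overfitting.</p>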



<p>Ghojogh pointed out that all models must tackle the problem of overfitting, especially if the organization trains the model too much.</p>



<h3 class="wp-block-heading">Rooting out overfitting in enterprise models</h3>



<p>While getting ahead of the overfitting problem is one step in avoiding this common issue, enterprise data science teams also need to identify and avoid models that have already become overfitted. But just as overfitting becomes more common as models grow in complexity, it also becomes more difficult to identify. Organizations that deploy neural networks and deep learning especially struggle with this step.</p>



<p>Forrester analyst Kjell Carlsson said that it has never been easier for overfitting to slip through the cracks. With the layers involved in deep learning, it&#8217;s difficult to point out when and where a model has become over-trained.</p>



<p>With high demand on data and analytics in the enterprise, there&#8217;s a time crunch on data science teams to deploy effective models. But as important as timeliness is, organizations have to implement processes and a culture that prioritizes machine learning effectiveness.</p>



<p>&#8220;You have to have very disciplined processes in your data science organization to go in and make sure that [you] are adhering to best practices,&#8221; Carlsson said.</p>



<p>When it comes to model deployment, organizations need to work with their data science team. If a model isn&#8217;t working or if the results aren&#8217;t as promising as previously thought, Carlsson said, this is where communication becomes important. The relationship between data scientists and their organization can help determine if the model is functioning poorly due to overfitting, or the end-user application, over-hyped projections or another non-data-related cause.</p>



<h3 class="wp-block-heading">The cost of overfitting and how to avoid it</h3>



<p>Getting ahead of overfitting is crucial for model deployment. An overfitted model, when deployed against real-world data, will provide either no usable gains or false insights. Both cost an enterprise significant amounts of time, resources and buy-in from executives; and this cost grows even greater if the overfitted model&#8217;s false insights go unnoticed and trusted.</p>



<p>As Gartner analyst Alex Linden put it, an overfitted model can render the insights from software or AI analysis useless. In the case of predictive maintenance, the system is not able to predict where or when the machine is going to fail. In predicting sales, the model will either be unable to predict whether somebody is going to react favorably to a selling proposition or will provide false positives.</p>



<p>Linden&#8217;s prescription for this is testing. Running the model through numerous trials with different data sets is the best way to prove its capabilities to generalize rather than memorize. For Linden it is the number one way of avoiding overfitting and there are machine learning technologies that can ease this process of trial and error.</p>



<p>&#8220;Mostly these days you can actually avoid overfitting by brute force, and the brute force capabilities of automated machine learning,&#8221; Linden said.</p>



<p>Instead of having data science team members run these trials manually and deploy numerous models to find the correct one, automated machine learning can evaluate a model against hundreds of data sets in a short amount of time.</p>



<p>Another way to avoid overfitting, especially with deep neural networks, is to build in a forgetting function. Having your data science team encode a forgetting function allows the model to organize what it learns, create data rules and carve out a space to ignore noise.</p>



<p>&#8220;They try to only memorize the generalities of things and try to forget the exact nature of the things,&#8221; Linden said.</p>
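<p>The &#8220;forgetting&#8221; idea is loosely analogous to dropout, a standard technique that randomly zeroes units during training so the network cannot rely on the exact nature of any one feature. Below is a minimal NumPy sketch of inverted dropout; it is an illustrative analogy, not necessarily the mechanism Linden has in mind:</p>

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout(activations, rate=0.5, training=True):
    """Inverted dropout: randomly zero units so the network cannot
    memorize exact feature values, only their general pattern."""
    if not training or rate == 0.0:
        return activations
    mask = rng.random(activations.shape) >= rate
    # Scale surviving units so the expected activation is unchanged.
    return activations * mask / (1.0 - rate)

x = np.ones((4, 8))          # a toy batch of activations
out = dropout(x, rate=0.5)   # roughly half the units are zeroed
print(out)
```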



<p>Another way to prevent overfitting in machine learning and deep learning models is to keep a holdout set of data to test the model on. If the model generalizes well enough, it should perform well against this unseen test data.</p>
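<p>A holdout split takes only a few lines with scikit-learn (synthetic data below; a large gap between training and holdout accuracy is the classic symptom of overfitting):</p>

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=20, random_state=42)

# Hold out 20% of the data; the model never sees it during training.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

model = RandomForestClassifier(random_state=42).fit(X_train, y_train)
train_acc = model.score(X_train, y_train)
test_acc = model.score(X_test, y_test)
print(f"train accuracy: {train_acc:.3f}, holdout accuracy: {test_acc:.3f}")
```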



<h3 class="wp-block-heading">Building a core knowledge of machine learning and AI</h3>



<p>Training a model often and with variety, coupled with forgetting functions and separate test data sets, is an effective set of measures against overfitting. On top of this, organizations need to ensure a basic level of competency about common machine learning model failures throughout the company.</p>



<p>This entails investing in and improving your organization&#8217;s machine learning and data literacy. Though data scientists and the data science team are the first line of defense against overfitting and model problems, organizations require even more oversight to ensure the successful application of machine learning.</p>



<p>&#8220;You&#8217;ll need the line-of-business stakeholder to be aware of overfitting, aware of these pitfalls and be on the lookout for them,&#8221; Carlsson said.</p>



<p>Having another part of the enterprise team understand common problems with model applications, and what to look for, adds another layer of safety. While there are numerous courses on machine learning and deep learning, finding the right level of basic AI literacy training for employees can be a challenge. Most courses are tailored toward data scientists and cover material a business owner or team member won&#8217;t realistically need to know.</p>



<p>Targeting the correct training to the right business professionals can decrease the chances of overfitting and prevent poor applications, Carlsson said.</p>
<p>The post <a href="https://www.aiuniverse.xyz/how-to-avoid-overfitting-in-machine-learning-models/">How to avoid overfitting in machine learning models</a> appeared first on <a href="https://www.aiuniverse.xyz">Artificial Intelligence</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://www.aiuniverse.xyz/how-to-avoid-overfitting-in-machine-learning-models/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>Deep Learning Surrogate Models Outperform Simulators</title>
		<link>https://www.aiuniverse.xyz/deep-learning-surrogate-models-outperform-simulators/</link>
					<comments>https://www.aiuniverse.xyz/deep-learning-surrogate-models-outperform-simulators/#respond</comments>
		
		<dc:creator><![CDATA[aiuniverse]]></dc:creator>
		<pubDate>Tue, 14 Jul 2020 07:05:09 +0000</pubDate>
				<category><![CDATA[Deep Learning]]></category>
		<category><![CDATA[deep learning]]></category>
		<category><![CDATA[LLNL]]></category>
		<category><![CDATA[Models]]></category>
		<category><![CDATA[researchers]]></category>
		<guid isPermaLink="false">http://www.aiuniverse.xyz/?p=10164</guid>

					<description><![CDATA[<p>Source: rtinsights.com New research from the Lawrence Livermore National Laboratory (LLNL) demonstrated deep learning surrogate models performing as well, and sometimes better, than more expensive simulators. The researchers tested the surrogate model on complicated inertial confinement fusion problems and found it could accurately emulate scientific processes, while also reducing compute time from half an hour to a <a class="read-more-link" href="https://www.aiuniverse.xyz/deep-learning-surrogate-models-outperform-simulators/">Read More</a></p>
<p>The post <a href="https://www.aiuniverse.xyz/deep-learning-surrogate-models-outperform-simulators/">Deep Learning Surrogate Models Outperform Simulators</a> appeared first on <a href="https://www.aiuniverse.xyz">Artificial Intelligence</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<p>Source: rtinsights.com</p>



<p>New research from the Lawrence Livermore National Laboratory (LLNL) demonstrated deep learning surrogate models performing as well, and sometimes better, than more expensive simulators.</p>



<p>The researchers tested the surrogate model on complicated inertial confinement fusion problems and found it could accurately emulate scientific processes, while also reducing compute time from half an hour to a few seconds.</p>
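<p>The surrogate idea can be sketched in a few lines: train a cheap neural network on input/output pairs generated by an expensive simulator, then query the network instead of rerunning the simulation. The toy example below (scikit-learn, with a trivial function standing in for the simulator) is illustrative only and bears no relation to LLNL&#8217;s actual models:</p>

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

def simulator(x):
    """Stand-in for an expensive physics simulation."""
    return np.sin(3 * x) + 0.5 * x

# Generate training data by running the "simulator" up front.
rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(2000, 1))
y = simulator(X).ravel()

# The surrogate: a small neural network fit to simulator outputs.
surrogate = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000,
                         random_state=0)
surrogate.fit(X, y)

# New queries hit the cheap surrogate instead of the simulator.
X_new = rng.uniform(-2, 2, size=(200, 1))
err = np.mean(np.abs(surrogate.predict(X_new) - simulator(X_new).ravel()))
print(f"mean absolute error vs. simulator: {err:.4f}")
```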



<p>The new deep learning-driven model significantly outperformed current surrogate models, while adequately replicating the simulator results.</p>



<p>“The nice thing about doing all this is not only that it makes the analysis easier, because now you have a common space for all these modalities, but we also showed that doing it this way actually gives you better models, better analysis and objectively better results than with baseline approaches,” said LLNL computer scientist Rushil Anirudh.</p>



<p>Alongside time and cost savings, the researchers believe the surrogate model could lead to new scientific discoveries, because of its ability to handle large volumes of high-dimensional data.</p>



<p>“This tool is providing us with a fundamentally different way of connecting simulations to experiments,” said fellow LLNL computer scientist, Timo Bremer. “By building these deep learning models, it allows us to directly predict the full complexity of the simulation data.</p>



<p>“Using this common latent space to correlate all these different modalities is going to be extremely valuable, not just for this particular piece of science, but everything that tries to combine computational sciences with experimental sciences. This is something that could potentially lead to new insights in a way that’s just unfeasible right now,” he added.</p>
<p>The post <a href="https://www.aiuniverse.xyz/deep-learning-surrogate-models-outperform-simulators/">Deep Learning Surrogate Models Outperform Simulators</a> appeared first on <a href="https://www.aiuniverse.xyz">Artificial Intelligence</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://www.aiuniverse.xyz/deep-learning-surrogate-models-outperform-simulators/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>The Importance of Image Resolution in Building Deep Learning Models for Medical Imaging</title>
		<link>https://www.aiuniverse.xyz/the-importance-of-image-resolution-in-building-deep-learning-models-for-medical-imaging/</link>
					<comments>https://www.aiuniverse.xyz/the-importance-of-image-resolution-in-building-deep-learning-models-for-medical-imaging/#respond</comments>
		
		<dc:creator><![CDATA[aiuniverse]]></dc:creator>
		<pubDate>Fri, 24 Jan 2020 08:11:29 +0000</pubDate>
				<category><![CDATA[Deep Learning]]></category>
		<category><![CDATA[Building]]></category>
		<category><![CDATA[deep learning]]></category>
		<category><![CDATA[Image Resolution]]></category>
		<category><![CDATA[Medical Imaging]]></category>
		<category><![CDATA[Models]]></category>
		<guid isPermaLink="false">http://www.aiuniverse.xyz/?p=6356</guid>

					<description><![CDATA[<p>Source: pubs.rsna.org Deep learning with convolutional neural networks (CNNs) has shown tremendous success in classifying images, as we have seen with the ImageNet competition (1), which consists of millions of everyday color images, such as animals, vehicles, and natural objects. For example, recent artificial intelligence (AI) systems have achieved a top-five accuracy (correct answer within <a class="read-more-link" href="https://www.aiuniverse.xyz/the-importance-of-image-resolution-in-building-deep-learning-models-for-medical-imaging/">Read More</a></p>
<p>The post <a href="https://www.aiuniverse.xyz/the-importance-of-image-resolution-in-building-deep-learning-models-for-medical-imaging/">The Importance of Image Resolution in Building Deep Learning Models for Medical Imaging</a> appeared first on <a href="https://www.aiuniverse.xyz">Artificial Intelligence</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<p>Source: pubs.rsna.org</p>



<p>Deep learning with convolutional neural networks (CNNs) has shown tremendous success in classifying images, as we have seen with the ImageNet competition (1), which consists of millions of everyday color images, such as animals, vehicles, and natural objects. For example, recent artificial intelligence (AI) systems have achieved a top-five accuracy (correct answer within the top five predictions) of greater than 96% on the ImageNet competition (2). To achieve this, computer vision scientists have generally found that deeper networks perform better, and as a result, modern AI architectures frequently have greater than 100 layers (2).</p>



<p>Because of the sheer size of such networks, which contain millions of parameters, most AI solutions use significantly downsampled images. For example, the famous AlexNet CNN that won ImageNet in 2012 used an input size of 227 × 227 pixels (1), which is a fraction of the native resolution of images taken by cameras and smartphones (usually greater than 2000 pixels in each dimension). Lower-resolution images are used for a variety of reasons. First, smaller images are easier to distribute across the Web, as ImageNet in itself is approximately 150 GB of data. Second, the task of identifying common objects such as planes or cars can be readily discerned at lower resolutions. Third, downsampled images make it easier and much faster to train deep neural networks. Finally, using lower-resolution images may lead to increased generalizability or less overfitting of deep learning models that focus on important high-level features.</p>
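<p>The downsampling step itself is simple. The sketch below uses plain block averaging (a stand-in for the bilinear or bicubic resizing real pipelines use) to show how a roughly 2000-pixel image becomes a 256-pixel training input:</p>

```python
import numpy as np

def downsample(img, factor):
    """Average non-overlapping factor-by-factor blocks
    (a crude but anti-aliased resize)."""
    h, w = img.shape
    return img.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

rng = np.random.default_rng(0)
radiograph = rng.random((2048, 2048))   # stand-in for a ~2000-pixel radiograph
small = downsample(radiograph, 8)       # 256 x 256, a common training size
print(small.shape)
```

<p>Each output pixel now summarizes an 8 × 8 neighborhood, which is precisely why findings only a few pixels across can vanish at this resolution.</p>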



<p>Given the success of deep learning in general image classification, many researchers have applied the same techniques used in the ImageNet competitions to medical imaging (3). With chest radiographs, for example, researchers have downsampled the input images to about 256 pixels in each dimension from original images with more than 2000 pixels in each dimension. Nevertheless, relatively high accuracy has been reported for detection on chest radiographs of some conditions, including tuberculosis, pleural effusion, atelectasis, and pneumonia (4,5).</p>



<p>However, subtle radiologic findings, such as pulmonary nodules, hairline fractures, or small pneumothoraces, are less likely to be visible at lower resolutions. As such, the optimal resolution for detecting such abnormalities using CNNs is an important research question. For example, in the 2017 Radiological Society of North America competition for determining bone age on skeletal radiographs (6), many competitors used an input size of 512 pixels or greater. For the DREAM (Dialogue for Reverse Engineering Assessments and Methods) challenge of classifying screening mammograms, resolutions of up to 1700 × 2100 pixels were used in top solutions (7). Recently, for the Society of Imaging Informatics in Medicine and American College of Radiology Pneumothorax Challenge (8), many top entries used an input size of up to 1024 × 1024 pixels.</p>



<p>In their article, “The Effect of Image Resolution on Deep Learning in Radiography,” Sabottke and Spieler (9) address that important question using the public ChestX-ray14 dataset from the National Institutes of Health, which consists of more than 100 000 chest radiographs stored as 8-bit gray-scale images at a resolution of 1024 × 1024 pixels (10). These radiographs have been labeled with 14 conditions including normal, lung nodule, pneumothorax, emphysema, and cardiomegaly (10). The authors used two popular deep CNNs, ResNet 34 and DenseNet 121, and analyzed the models’ efficacy in classifying radiographs at image resolutions ranging from 32 × 32 pixels to 600 × 600 pixels.</p>



<p>The authors found that the performance of most models tended to plateau at resolutions of around 256 × 256 pixels and 320 × 320 pixels. However, classification of emphysema and lung nodules performed better at 512 × 512 pixels and 448 × 448 pixels, respectively, than at lower resolutions. Emphysema findings can be subtle in mild cases, manifested by faint lucencies, which probably explains the need for higher resolution. Similarly, small lung nodules may be “blurred out” and not visible at lower resolution, which can explain the improvement in classification performance at higher resolutions.</p>



<p>The authors’ work is important. As we move further in the application of AI in medical imaging, we should be more cognizant of the potential impact of image resolution on the performance of AI models, whether for segmentation, classification, or another task. Moreover, groups who create public datasets to advance machine learning in medical imaging should consider releasing the images at full or near-full resolution. This would allow researchers to further understand the impact of image resolution and could lead to more robust models that better translate into clinical practice.</p>
<p>The post <a href="https://www.aiuniverse.xyz/the-importance-of-image-resolution-in-building-deep-learning-models-for-medical-imaging/">The Importance of Image Resolution in Building Deep Learning Models for Medical Imaging</a> appeared first on <a href="https://www.aiuniverse.xyz">Artificial Intelligence</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://www.aiuniverse.xyz/the-importance-of-image-resolution-in-building-deep-learning-models-for-medical-imaging/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>Cortex Labs helps data scientists deploy machine learning models in the cloud</title>
		<link>https://www.aiuniverse.xyz/cortex-labs-helps-data-scientists-deploy-machine-learning-models-in-the-cloud/</link>
					<comments>https://www.aiuniverse.xyz/cortex-labs-helps-data-scientists-deploy-machine-learning-models-in-the-cloud/#respond</comments>
		
		<dc:creator><![CDATA[aiuniverse]]></dc:creator>
		<pubDate>Fri, 24 Jan 2020 07:29:29 +0000</pubDate>
				<category><![CDATA[Machine Learning]]></category>
		<category><![CDATA[cloud]]></category>
		<category><![CDATA[Cortex Labs]]></category>
		<category><![CDATA[data scientists]]></category>
		<category><![CDATA[deployment]]></category>
		<category><![CDATA[Machine learning]]></category>
		<category><![CDATA[Models]]></category>
		<guid isPermaLink="false">http://www.aiuniverse.xyz/?p=6341</guid>

					<description><![CDATA[<p>Source: techcrunch.com It’s one thing to develop a working machine learning model, it’s another to put it to work in an application. Cortex Labs is an early-stage startup with some open-source tooling designed to help data scientists take that last step. The company’s founders were students at Berkeley when they observed that one of the <a class="read-more-link" href="https://www.aiuniverse.xyz/cortex-labs-helps-data-scientists-deploy-machine-learning-models-in-the-cloud/">Read More</a></p>
<p>The post <a href="https://www.aiuniverse.xyz/cortex-labs-helps-data-scientists-deploy-machine-learning-models-in-the-cloud/">Cortex Labs helps data scientists deploy machine learning models in the cloud</a> appeared first on <a href="https://www.aiuniverse.xyz">Artificial Intelligence</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<p>Source: techcrunch.com</p>



<p>It’s one thing to develop a working machine learning model; it’s another to put it to work in an application. Cortex Labs is an early-stage startup with open-source tooling designed to help data scientists take that last step.</p>



<p>The company’s founders were students at Berkeley when they observed that one of the problems with creating machine learning models was finding a way to deploy them. While plenty of open-source tooling was available, data scientists are not experts in infrastructure.</p>



<p>CEO Omer Spillinger says that infrastructure was something the four members of the founding team — himself, CTO David Eliahu, head of engineering Vishal Bollu and head of growth Caleb Kaiser — understood well.</p>



<p>What the four founders did was take a set of open-source tools and combine them with AWS services to provide a way to deploy models more easily. “We take open-source tools like TensorFlow, Kubernetes and Docker and we combine them with AWS services like CloudWatch, EKS (Amazon’s flavor of Kubernetes) and S3 to basically give one API for developers to deploy their models,” Spillinger explained.</p>



<p>He says that a data scientist starts by uploading an exported model file to S3 cloud storage. “Then we pull it, containerize it and deploy it on Kubernetes behind the scenes. We automatically scale the workload, and let you switch to GPUs if it’s compute intensive. We stream logs and expose [the model] to the web. We help you manage security around that, stuff like that,” he said.</p>
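<p>The end result of such a pipeline is a model exposed behind a web endpoint. Below is a minimal sketch of that kind of prediction API, using only the Python standard library; the &#8220;model&#8221; here is a hypothetical fixed linear scorer, and none of this reflects Cortex&#8217;s actual code:</p>

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

def predict(features):
    """Hypothetical stand-in for a trained model."""
    weights = [0.4, -0.2, 0.1]
    return sum(w * x for w, x in zip(weights, features))

class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read a JSON body like {"features": [1.0, 2.0, 3.0]}.
        body = json.loads(self.rfile.read(int(self.headers["Content-Length"])))
        payload = json.dumps({"prediction": predict(body["features"])}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(payload)

# Port 0 lets the OS pick a free port; serve in a background thread.
server = HTTPServer(("127.0.0.1", 0), PredictHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
print(f"serving predictions on port {server.server_address[1]}")
```

<p>Tooling like that described above automates everything around such an endpoint: containerizing it, scaling it, and streaming its logs.</p>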



<p>While he acknowledges this is not unlike Amazon SageMaker, the company’s long-term goal is to support all of the major cloud platforms. SageMaker, of course, only works on the Amazon cloud, while Cortex will eventually work on any cloud. In fact, Spillinger says the biggest feature request they’ve gotten to this point is to support Google Cloud. He says that and support for Microsoft Azure are on the road map.</p>



<p>The Cortex founders have been keeping their head above water while they wait for a commercial product with the help of an $888,888 seed round from Engineering Capital in 2018. If you’re wondering about that oddly specific number, it’s partly an inside joke — Spillinger’s birthday is August 8th — and partly a number arrived at to make the valuation work, he said.</p>



<p>For now, the company is offering the open-source tools, and building a community of developers and data scientists. Eventually, it wants to monetize by building a cloud service for companies that don’t want to manage clusters — but that is down the road, Spillinger said.</p>
<p>The post <a href="https://www.aiuniverse.xyz/cortex-labs-helps-data-scientists-deploy-machine-learning-models-in-the-cloud/">Cortex Labs helps data scientists deploy machine learning models in the cloud</a> appeared first on <a href="https://www.aiuniverse.xyz">Artificial Intelligence</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://www.aiuniverse.xyz/cortex-labs-helps-data-scientists-deploy-machine-learning-models-in-the-cloud/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>Data Mining Tool Could Help Train Machine Learning Models</title>
		<link>https://www.aiuniverse.xyz/data-mining-tool-could-help-train-machine-learning-models/</link>
					<comments>https://www.aiuniverse.xyz/data-mining-tool-could-help-train-machine-learning-models/#respond</comments>
		
		<dc:creator><![CDATA[aiuniverse]]></dc:creator>
		<pubDate>Thu, 26 Dec 2019 07:10:24 +0000</pubDate>
				<category><![CDATA[Data Mining]]></category>
		<category><![CDATA[could]]></category>
		<category><![CDATA[data mining]]></category>
		<category><![CDATA[Machine learning]]></category>
		<category><![CDATA[Models]]></category>
		<category><![CDATA[researchers]]></category>
		<category><![CDATA[tool]]></category>
		<guid isPermaLink="false">http://www.aiuniverse.xyz/?p=5804</guid>

					<description><![CDATA[<p>Source: healthitanalytics.com December 24, 2019 &#8211; Researchers at Purdue University have created a new framework for mining data to train machine learning models used in drug development. Using machine learning for drug development requires researchers to create a process for the computer to extract needed information from a pool of data points. Drug scientists have to pull biological data and train <a class="read-more-link" href="https://www.aiuniverse.xyz/data-mining-tool-could-help-train-machine-learning-models/">Read More</a></p>
<p>The post <a href="https://www.aiuniverse.xyz/data-mining-tool-could-help-train-machine-learning-models/">Data Mining Tool Could Help Train Machine Learning Models</a> appeared first on <a href="https://www.aiuniverse.xyz">Artificial Intelligence</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<p>Source: healthitanalytics.com</p>



<p>December 24, 2019 &#8211; Researchers at Purdue University have created a new framework for mining data to train machine learning models used in drug development.</p>



<p>Using machine learning for drug development requires researchers to create a process for the computer to extract needed information from a pool of data points. Drug scientists have to pull biological data and train the software to understand how a typical human body will interact with the combinations used to form medications.</p>



<p>Drug discovery researchers developed the Lemon framework, which helps drug researchers better mine the Protein Data Bank (PDB), a comprehensive resource with more than 140,000 biomolecular structures and new ones released every week.</p>



<p>“PDB is an essential tool for the drug discovery community,” said Gaurav Chopra,&nbsp;an assistant professor of&nbsp;analytical and physical chemistry&nbsp;in Purdue’s&nbsp;College of Science&nbsp;who works with other researchers in the&nbsp;Purdue Institute for Drug Discovery&nbsp;and led the team that created Lemon.&nbsp;</p>



<p>“The problem is that it can take an enormous amount of time to sort through all the accumulated data. Machine learning can help, but you still need a strong framework from which the computer can quickly analyze data to help in the creation of safe and effective drugs.”</p>



<p>The Lemon software platform mines the PDB within minutes. The platform also lets users write custom functions in a standard manner, include Lemon as part of their own software suites, and generate unique benchmarking data sets for the entire scientific community.</p>
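<p>The custom-function workflow can be sketched generically: a mining loop applies a user-supplied selection function to every entry. The records and field names below are hypothetical, and this is not Lemon&#8217;s actual API:</p>

```python
# Hypothetical records standing in for parsed PDB entries.
structures = [
    {"id": "1ABC", "resolution": 1.8, "has_ligand": True},
    {"id": "2XYZ", "resolution": 3.5, "has_ligand": False},
    {"id": "3DEF", "resolution": 2.1, "has_ligand": True},
]

def mine(entries, keep):
    """Apply a user-supplied selection function to every entry,
    mirroring the custom-function workflow described above."""
    return [entry["id"] for entry in entries if keep(entry)]

# Example custom function: high-resolution structures with a bound ligand.
benchmark = mine(structures,
                 lambda e: e["resolution"] < 2.5 and e["has_ligand"])
print(benchmark)
```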



<p>The Lemon platform was originally designed to create benchmarking sets for drug design software and identify the lemons, or the biomolecular interactions that can’t be modeled well, in the PDB.</p>



<p>“Experimental structures deposited in PDB have resulted in several advances for structural and computational biology scientific and education communities that help advance drug development and other areas,” said Jonathan Fine, a PhD student in chemistry who worked with Chopra to develop the platform.</p>



<p>“We created Lemon as a one-stop-shop to quickly mine the entire data bank and pull out the useful biological information that is key for developing drugs.”</p>



<p>As machine learning becomes more integral to the healthcare industry, researchers have attempted to improve the accuracy and efficiency of these algorithms. Recently, a team at Penn Medicine discovered a once-hidden through-line between two widely used predictive models that could increase the accuracy of machine learning tools. The discovery could expand the use of machine learning throughout healthcare and other industries.</p>



<p>&#8220;The expansion of machine learning to high-stakes application domains such as medicine, finance, and criminal justice, where making informed decisions requires clear understanding of the model, has increased the interest in interpretable machine learning,&#8221; researchers wrote in a&nbsp;study&nbsp;published in&nbsp;<em>Proceedings of the National Academy of Sciences (PNAS)</em>.</p>



<p>“Despite recent efforts to formalize the concept of interpretability in machine learning, there is considerable disagreement on what such a concept means and how to measure it.”</p>



<p>Additionally, MIT researchers recently developed an automated system that can gather more data from images used to train machine learning models, including algorithms that can analyze brain scans to help treat and diagnose neurological conditions.</p>



<p>“We’re hoping this will make image segmentation more accessible in realistic situations where you don’t have a lot of training data,” said first author Amy Zhao, a graduate student in the Department of Electrical Engineering and Computer Science (EECS) and Computer Science and Artificial Intelligence Laboratory (CSAIL).</p>



<p>“In our approach, you can learn to mimic the variations in unlabeled scans to intelligently synthesize a large dataset to train your network.”</p>
<p>The post <a href="https://www.aiuniverse.xyz/data-mining-tool-could-help-train-machine-learning-models/">Data Mining Tool Could Help Train Machine Learning Models</a> appeared first on <a href="https://www.aiuniverse.xyz">Artificial Intelligence</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://www.aiuniverse.xyz/data-mining-tool-could-help-train-machine-learning-models/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
	</channel>
</rss>
