
Chi Explores Essence of Big Data

Source: cmu.edu

Whether you notice it or not, you receive and create countless pieces of data in your everyday life, sometimes merely by sending messages or browsing items on a shopping site. Many fields, such as medicine and entertainment, are data-rich, which drives researchers to find new ways to capture and analyze this rapidly growing information.

Carnegie Mellon University’s Yuejie Chi is one of these researchers.

“There’re lots of interesting questions about how you can model such data and how you can extract information from these data,” said Chi, an associate professor of electrical and computer engineering. “They allow me to apply the type of tools I know to some practical problems that domain experts might be interested in.”

For her research, Chi earned a Presidential Early Career Award for Scientists and Engineers (PECASE). Established in 1996, the PECASE is the highest honor bestowed by the United States Government on outstanding scientists and engineers who have begun their independent research careers and have shown exceptional promise for advancing their fields.

Chi’s research focuses on representing data efficiently to reduce complexity and improve decision making.

“We can obtain plenty of information from big data, but the data we observe and collect every day can be highly redundant, messy, and incomplete,” Chi said. “Take movie sites such as Netflix as an example; the users may only review a small number of films even though there are thousands of films out there.”

How, then, can people extract useful information from such raw data? Though overwhelming at first glance, the entries in big data matrices can be correlated. There may be millions of users on a movie site, but many of them share similarities such as age, country of origin, and educational background. Likewise, movies can share genres, directors, and lead actors. By studying entries through their correlations, researchers can uncover hidden features. By focusing on these latent variables, movie sites can predict the missing entries, and thus which movies a user might like. In this way, they can design algorithms to build an effective recommendation system.

“You don’t directly just think about the data itself; you’re trying to get some structures,” Chi said. “Once you get a good model of the latent structure, you can think about solving an inverse problem where you try to recover those latent structures using optimization. So we’re studying how to design algorithms to recover these structures.”

Aside from recommendation systems, Chi also uses latent representations to examine problems associated with imaging modalities. Biologists use techniques such as single-molecule super-resolution microscopy to look at structures within cells, but the images they collect often lack the desired resolution due to limitations of the device. By studying latent structures, Chi’s team has developed a new algorithm that significantly enhances image resolution; it uses the same available data but fewer computational resources.
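The imaging problem can be caricatured in one dimension: a few point sources blurred by a known point-spread function, recovered by exploiting sparsity as the latent structure. The sketch below uses plain iterative soft-thresholding (ISTA); the blur width, noise level, and penalty are made-up illustrative values, and none of this is the team's actual algorithm:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical setup: three point sources on a 1-D grid, blurred by a known
# Gaussian point-spread function -- a caricature of a low-resolution image.
n = 200
x_true = np.zeros(n)
x_true[[40, 90, 150]] = [1.0, 0.7, 1.2]
t = np.arange(n)
A = np.exp(-0.5 * ((t[:, None] - t[None, :]) / 4.0) ** 2)   # blur operator
y = A @ x_true + 0.01 * rng.normal(size=n)                   # blurry, noisy data

# ISTA: a gradient step on the data fit, then soft-thresholding to
# enforce sparsity (the assumed latent structure of the image).
lam = 0.1
step = 1.0 / np.linalg.norm(A, 2) ** 2
x = np.zeros(n)
for _ in range(500):
    x = x - step * (A.T @ (A @ x - y))
    x = np.sign(x) * np.maximum(np.abs(x) - step * lam, 0.0)

print("reconstruction error:", np.linalg.norm(x - x_true))
print("error of the blurry input:", np.linalg.norm(y - x_true))
```

The recovered signal concentrates back toward the original point sources, so the reconstruction sits much closer to the ground truth than the blurred measurement does.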

Recently, Chi has been developing algorithms for distributed optimization. Nowadays, people often distribute data to different machines, as the data sets are too massive to fit onto a single device. Once they establish a distributed setting, however, communication issues may arise among individual machines. There may be adversarial events, and some entities may not want to share data with the central location for privacy reasons. Thus, Chi aims to design algorithms that are communication-efficient and resilient to outlier events.
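One way to make such a distributed scheme resilient is to replace the server's plain averaging with a robust aggregate. The sketch below is a generic illustration with invented sizes, not Chi's method: it runs distributed gradient descent on a least-squares problem where one worker sends corrupted gradients, and the server takes a coordinate-wise median instead of a mean:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical setup: 10 workers each hold local data for a shared
# least-squares problem; one worker is adversarial and sends garbage.
d, n_workers, n_local = 5, 10, 50
w_true = rng.normal(size=d)
data = []
for _ in range(n_workers):
    X = rng.normal(size=(n_local, d))
    y = X @ w_true + 0.01 * rng.normal(size=n_local)
    data.append((X, y))

def local_grad(w, X, y):
    """Gradient of the local least-squares loss at w."""
    return X.T @ (X @ w - y) / len(y)

w = np.zeros(d)
lr = 0.1
for _ in range(200):
    grads = [local_grad(w, X, y) for X, y in data]
    grads[0] = rng.normal(scale=100.0, size=d)   # adversarial worker
    # Robust aggregation: a coordinate-wise median shrugs off the outlier,
    # where a plain mean would be dragged far off course.
    g = np.median(np.stack(grads), axis=0)
    w = w - lr * g

print("distance to w_true:", np.linalg.norm(w - w_true))
```

A coordinate-wise median tolerates a minority of corrupted workers because each coordinate of the aggregate stays bracketed by honest values; communication efficiency can then be layered on top, for example by compressing or quantizing the gradients each worker sends.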

“Once you know how to represent your data, you can leverage the structures in your algorithm design and achieve the goal more efficiently,” Chi said.
