Drowning in data: NKU creates new data science major

Every minute, trillions of bytes of data are being created. From passive online browsing to checking out at the grocery store, the volume and variety of data being collected every day would have seemed inconceivable not long ago.

With more information generated, stored and accessible than ever before, this unprecedented data collection is now shaping the way universities prepare students for the future job market.

In line with this trend and with the support of local corporations such as Toyota and Procter & Gamble, NKU launched the Midwest’s first undergraduate data science program this semester in the College of Informatics.

The degree will arm students with the skills to meet the growing demand for people who can analyze big data, explain it and advise on how to use it. From government and business to science and health care, data science is part of many sectors of society and has captured the interest of various organizations, according to Kevin Kirby, dean of the College of Informatics and a contributor to the program’s proposal.

“Because it is both interesting and scary, it makes for a very exciting program,” Kirby said. “It is information across everything…It really captures the essence of informatics.”

This “premier” or “ultimate Informatics” degree takes an interdisciplinary approach to meet the growing demand from corporations that need data scientists. The program ties together the computer science, business informatics and communication departments, along with the mathematics and statistics department, to develop students’ skills in analyzing big data and using it in innovative ways.

“Data science becomes a model for how other people may want to incorporate things they never thought of putting together that create these avenues that are needed within the metro region for employment and the needs of the 21st century job market,” said James McGuffee, chair of the computer science department. “It’s a model for how we go about the future, developing new programs that we need in the area.”

McGuffee is teaching the major’s first course, Introduction to Data Science, this semester. The course has 15 students enrolled and will familiarize them with the subject and program outcomes through everything from reading books, such as Kenneth Cukier’s “Big Data: A Revolution That Will Transform How We Live, Work, and Think,” to hearing guest speakers from companies that work with data science.

The job of data scientist was described as “the sexiest job of the 21st century” by the Harvard Business Review in 2012.

All over the world, people with the ability to effectively interpret and communicate the potential of the vast amount of data constantly being created and recorded are in demand. The McKinsey Global Institute predicts that by 2018, the U.S. might face a shortage of 140,000 to 190,000 workers with the skills to effectively use big data.

“Society has changed to the point where we are generating so much data, and there is such interest in analyzing it because it is coming at you so fast and there is so much of it,” Kirby said. “There is this thought that we haven’t fully realized how to use it yet, so companies are starved for people who know how to do this.”


NKU’s interdisciplinary program

The university’s launch of an undergraduate data science program is a “game changer” and a “really big deal,” according to Cukier, data editor for The Economist and author of a book used in the first data science course.

“Most people wouldn’t think of Kentucky as being the epicenter of big data,” Cukier said.

He said it is “essential” that universities prepare to train students in data science and big data because the more NKU explores and develops this program, the more competitive the students will be as employees, and Kentucky will be as a state.

Students going to a school with a cross-disciplinary program will “have a leg up in the workforce, bring more value to their employers and have a more satisfying career because they are well-trained,” according to McGuffee.

“We don’t envision it will ever be the largest per-student degree, but it’s going to be an extraordinarily prestigious [one], if not one of the most prestigious degrees at this university, if not in the entire commonwealth,” McGuffee said.

To create a well-rounded program and, ultimately, well-rounded data scientists, the curriculum combines mathematics and statistics skills, programming and machine learning, and specific subject-matter expertise. The students are also studying the ethics of the field.

Students are required to crunch the data by finding and exploring it; to make sense of the information with machine learning and statistics; and to explain the findings to others so they can make decisions. They are trained to use new technology and tools to decipher trillions of data points.

The degree “emphasizes the critical arc that runs from data to information, information to knowledge, and knowledge to decision making,” according to the program’s website. Throughout the program, the curriculum works on building skills spanning the foundations of data science, statistical modeling, data mining, business analytics and scientific visualization.

A principal part of the program is the capstone project, which requires students to collect, explore, communicate and interpret a big data set.

The program will benefit the university through spin-off opportunities, such as collaborative student research projects and possibly a data science minor in the future, according to McGuffee.

“I actually don’t think you can underestimate the impact this major is having on this university,” McGuffee said.


Meet one of the first data science majors

The data science program attracts students driven by curiosity, according to Kirby. He refers to this type of student as a “detective.” Students have to sort through trillions of bytes of data coming at them fast and find a pattern or meaning in the information.

The program seeks student detectives with a dedication to using new tools to discover hidden patterns buried in vast amounts of data and to communicating this information in compelling and effective ways, according to Kirby.

Freshman Nathaniel Hudson is one of these curious students and one of the first data science majors. He describes himself as a “big nerd when it comes to computers.” He found the field after attending a program with Google over the summer.

“Data science is one [field] our lecturer pushed more than anything else,” Hudson said. “It was kind of cool because I was sitting among the other people that got in. There were people going to Stanford, Princeton, Harvard, Carnegie Mellon and the like, and I was the only one that could say my school has a data science program.”

Not everyone grasps how much data is being generated and stored by various corporations every day. In the United States, 15 out of 17 industry sectors have more data stored per company than the U.S. Library of Congress, according to the McKinsey Global Institute.

However, data science majors begin to register its true volume and potential impact early on.

“The one thing that will be intimidating, but it’s just because I don’t have the experience at the moment, is the idea of how big the data is,” Hudson said. “By the end of the major you are supposed to be able to take on I think petabytes [about 1,000 terabytes] of information.”

The program will train students in everything from utilizing the information discovered in these enormous data sets to help businesses make decisions to building search engines.

In the future, Hudson plans to work for one of the first major companies to analyze big data to make decisions: Google.