Implementing Algorithms for Data Analytics
In an age where we are buried in information, how do we make sense of it?
Graduate student in computer science Wellington Cabrera (Ph.D., ’17) worked on implementing algorithms for data mining and analytics.
One of technology’s many advantages is the ease with which we can collect information on everything from internet browsing patterns to activity levels recorded by wearable trackers to weather patterns, all of which allow for more accurate insights. This ease of data collection, however, has its own drawback, as the sheer amount of data can be overwhelming.
Data Mining: Teasing Out Hidden Patterns
While working on his graduate degree in computer science, Wellington Cabrera (Ph.D., ’17) sought to address this problem by creating and implementing algorithms for data analytics and mining in database systems. This research was performed under the guidance of Carlos Ordonez, associate professor of computer science in the College of Natural Sciences and Mathematics.
In this deluge of information, with all of its tangled implications, data mining works to tease out the hidden patterns.
“Another name for data mining is ‘knowledge discovery,’” Cabrera noted.
Parallel Computing Increases Speed and Storage Capabilities
Cabrera’s research focused on developing algorithms that could work on parallel database systems. Often, database systems are distributed across multiple computers, a strategy termed parallel computing. Although this increases a database’s speed and storage capabilities, it also requires an adjustment in how tasks are performed.
“Developing an algorithm for a single computer requires a lot of sequential steps,” Cabrera said. “For parallel systems, the challenge is getting these multiple computers to work together to solve problems.”
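To make the idea concrete, here is a minimal Python sketch of one common way to split an analytics task across workers. It is an illustration only, not Cabrera’s actual method: the function names (partial_summary, merge, parallel_mean_variance) are hypothetical, and threads on one machine stand in for the separate computers of a parallel database. Each worker summarizes only its own slice of the data, and a coordinator combines the small summaries into the final answer.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_summary(rows):
    """Summarize one partition as (count, sum, sum of squares)."""
    count = len(rows)
    total = sum(rows)
    total_sq = sum(x * x for x in rows)
    return count, total, total_sq


def merge(summaries):
    """Combine per-partition summaries by adding them component-wise."""
    count = sum(s[0] for s in summaries)
    total = sum(s[1] for s in summaries)
    total_sq = sum(s[2] for s in summaries)
    return count, total, total_sq


def parallel_mean_variance(values, workers):
    """Compute mean and population variance from partitioned data.

    Assumption: threads stand in for separate machines here.
    """
    # Round-robin split so each worker gets a similar share of rows.
    partitions = [values[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        summaries = list(pool.map(partial_summary, partitions))
    count, total, total_sq = merge(summaries)
    mean = total / count
    variance = total_sq / count - mean * mean
    return mean, variance


if __name__ == "__main__":
    data = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
    print(parallel_mean_variance(data, workers=2))
    print(parallel_mean_variance(data, workers=4))
```

Because the summaries are simple sums, the answer does not depend on how many workers share the data, and only three numbers per worker need to be sent to the coordinator. That independence from the number of workers is one ingredient of the coordination Cabrera describes.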
Scalable Algorithms
Cabrera focused on scalable algorithms, which are designed to deliver comparable performance regardless of a database’s size.
“You want an algorithm that can work just as well with two computers as it does with 1,000,” Cabrera said. “When you have many computers working together, you tend to see a degradation. If the algorithm does not coordinate the parallel processing correctly, then the computers cannot work together in the right manner, becoming a mess.”
Overcoming Challenges
During his time as a graduate student, Cabrera faced many of the typical challenges of juggling coursework, research and his responsibilities as a teaching assistant, all while trying to plan for the next step in his career.
“To get a Ph.D., you have to overcome many obstacles,” Cabrera said.
This hard work ultimately paid off, as Cabrera landed several internships, published numerous papers in well-respected journals, and, after graduation, was offered a job in the tech industry.
“I am very stubborn,” Cabrera said. “I don’t like to give up.”
- Rachel Fairbank, College of Natural Sciences and Mathematics