Quoc Luong Huynh Finds the Art In Math

Quoc Luong Huynh, ’24 Statistics, and fellow student researcher Ayane Gomi, ’26 Mechanical Engineering, at the CSU Student Research Competition in Humboldt, California. Photo courtesy of Cal Poly Humboldt.
For Quoc Luong Huynh, ’24 Statistics, math is an art. And not just an art — a creative art. “If you look at proofs or theorems like the Pythagorean theorem, a lot of them are really creative,” he says. “And I love that creativity.”
His passion for math, statistics and research spills over even during a brief conversation — it’s not difficult to see why he was selected, along with 16 other SJSU student researchers, to represent SJSU at the annual CSU Student Research Competition in Humboldt, California from April 24-26.
Huynh presented his role in the research of Cristina Tortora, associate professor of statistics, whose work with clustering and machine learning algorithms has the potential to help researchers map and analyze their data.
The project may sound dry or even remote to the average person, but the implications are potentially wide-ranging. After all, as anyone who tuts endlessly over their Apple Watch may know, statistics can govern our lives.
Tortora’s research project (and Huynh’s role in it) grew out of a collaborator’s request. Meredith Wallace, associate professor of psychiatry and biostatistics at the University of Pittsburgh, asked Tortora if she could extend her clustering methods to different types of data.
One of the struggles in research is analyzing data, and one of the specific struggles in social sciences research is converting surveys and other sorts of observations into hard data. Wallace, along with many others, was hoping for a solution. Tortora’s project (with assistance from Huynh) created just such a method, one that solved two problems: 1) including categorical data and 2) accounting for asymmetry.
Let’s tackle them one by one, shall we?
Including categorical data
When you think of data, you’re most likely thinking of hard numbers, or continuous data. In Huynh’s examples, which mimic the sleep studies he and Tortora helped with, this would be a measurement such as: Patient Y slept X hours per night. Categorical data, on the other hand, has no obvious numerical value: survey responses, observations and the like. Categorical data for a sleep study would be something like: Patient Y took a nap on the observed day of the study. Patient Y smokes. Patient Y describes how tired he or she feels.
If you’re a researcher studying sleep, this kind of categorical data is crucial, since X hours of sleep per night doesn’t tell the full story. The categorical data listed above would most likely affect how well Patient Y slept and help account for anomalies. But current machine learning algorithms make it difficult to include that data, and translating it into a usable form is time-consuming for a researcher. Tortora’s work helps fix this issue.
As she explains, “This project can group patients in ways that help doctors provide more targeted and personalized treatments.”
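To make the translation problem concrete, here is a minimal sketch in Python of the manual workaround a researcher might otherwise use: turning each category into a set of 0/1 columns (often called one-hot encoding) so that a numbers-only algorithm can read it. This is not Tortora’s method, which is written in R and has not yet been published. The patient records, field names and category levels below are all invented for illustration.

```python
# Hypothetical sleep-study records; every name and value here is made up.
patients = [
    {"id": "A", "hours_slept": 7.5, "napped": "yes", "smokes": "no", "tiredness": "low"},
    {"id": "B", "hours_slept": 5.0, "napped": "no", "smokes": "yes", "tiredness": "high"},
    {"id": "C", "hours_slept": 6.5, "napped": "no", "smokes": "no", "tiredness": "medium"},
]

# The researcher has to list every possible answer for every categorical field by hand.
CATEGORIES = {
    "napped": ["no", "yes"],
    "smokes": ["no", "yes"],
    "tiredness": ["low", "medium", "high"],
}


def encode(record):
    """Flatten one mixed record into a list of numbers.

    The continuous value is kept as-is; each categorical field becomes
    one 0/1 column per possible level (one-hot encoding).
    """
    row = [record["hours_slept"]]
    for field, levels in CATEGORIES.items():
        row.extend(1.0 if record[field] == level else 0.0 for level in levels)
    return row


encoded = [encode(p) for p in patients]

for patient, row in zip(patients, encoded):
    print(patient["id"], row)
```

Even in this toy case, one measurement and three survey answers become eight columns, and the researcher has to decide in advance how every answer is coded. That is the kind of hand translation the article describes as time-consuming.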
Including asymmetric data
Most machine learning algorithms, as Huynh explains, are “based on very stringent assumptions of the data distribution,” which essentially means that they expect data to gather around a mean, with a roughly circular spread that follows the bell curve. But of course, this doesn’t account for all data, since it assumes the data is symmetrical, with values falling off evenly on either side of the average.
In Tortora’s method, the data can be asymmetrical or skewed. As Huynh says, “The data distribution is now much more flexible instead of being rigid and stringent.”
The implications of this are also deeply important. Huynh uses the example of income distribution. If you looked at income distribution in the United States, for example, most people would probably cluster around the middle; there aren’t many people making over $200,000 a year, although they do exist. A rigid, stringent data analysis (or algorithm) might overlook these high earners in favor of the averages, but again, that isn’t telling the full story. And the full story of something like income distribution, sleep research or any number of pressing social science issues is vitally important to tell.
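The income example can be sketched in a few lines of Python. The figures below are invented, not real income data, and the skewness calculation is the standard textbook measure rather than anything taken from Tortora’s algorithm. The point is only to show how a couple of high earners pull the average away from the typical person, which is what a symmetry assumption misses.

```python
import statistics

# Hypothetical annual incomes in dollars: eight typical earners and two high earners.
incomes = [32_000, 38_000, 41_000, 45_000, 48_000,
           52_000, 55_000, 61_000, 250_000, 480_000]

mean = statistics.mean(incomes)      # pulled upward by the two high earners
median = statistics.median(incomes)  # the middle of the group


def skewness(values):
    """Population skewness: the average cubed distance from the mean,
    measured in standard deviations. Zero means symmetrical data;
    a positive value means a long tail of high values."""
    m = statistics.mean(values)
    sd = statistics.pstdev(values)
    return sum(((v - m) / sd) ** 3 for v in values) / len(values)


skew = skewness(incomes)

print("mean:", mean)
print("median:", median)
print("skewness:", round(skew, 2))
```

In this made-up sample the mean lands well above the median and the skewness is positive, so a summary built on the average alone would describe almost nobody in the group.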
A love of research
Huynh was initially recommended for work in Tortora’s lab because of his interest and performance in statistics courses. He has now worked with her for nearly two years and has continued post-graduation as Tortora prepares to publish a paper on the new algorithm (Tortora and Huynh will be co-lead authors).
Tortora has found Huynh to be very valuable to the project. “Quoc is a very hardworking and independent researcher,” she explains. “What stood out most was his problem-solving ability — whenever he encountered a challenge, he didn’t just bring up the issue; he came prepared with several possible solutions. He also contributed many thoughtful ideas for future developments of the project.”
Both she and Huynh are excited by the possibilities and practical applications of the research. Once the paper is published, the algorithm, written in the programming language R, will be open source and available to all who wish to use it. Up to this point, Tortora has focused the algorithm on psychiatric research, but it could also have broader use. Tortora is already extending her research to another subfield, and Huynh is assisting with that project as well.
“The exceptional flexibility of this new model makes it suitable for a wide range of applications,” she adds. “While we apply it to a medical dataset, it can just as effectively be used in fields such as business, marketing and the social sciences. In all these areas, identifying meaningful subgroups within the data can enhance interpretability and help extract valuable insights.”
The research competition
And as Tortora and Huynh know, these insights are more powerful when they can be communicated to a larger audience.
Huynh, who sees himself as an “applied scientist,” saw the CSU Student Research Competition as a chance to meet other like-minded people and practice this type of communication in real time.
“It was a super fun, interesting and educational trip,” he says. “Getting the opportunity to meet and socialize with so many people outside your field like this is very rare in the professional and academic world. We all asked each other about our projects and tried our best to understand the research. Also, SJSU’s cohort was amazing! We tried to go to each other’s presentation sessions to support our peers, and I appreciated that a lot.”
He adds, “I think it was a very good experience to help me crystallize and learn how to express [research] ideas to the general public without just dumping a whole bunch of math and very technical terms on everyone. To me, when you write [or present] a research paper, you want it to be approachable and understandable, not just a chance to brag about how good you are. So I think [the competition was] a great experience for me from many different perspectives.”
It’s all part of his overall enthusiasm for research. “I find it fulfilling to translate what I know in math to actual use in real-life applications,” he explains. “It’s the satisfaction of proving something, of solving those problems and having those proven solutions used in real life.”
Learn more about research, scholarship and creative activity at SJSU.