[00:00:00] John Herbert: Not all science is going to take us to the moon, and that's okay. And the sort of premise is that I would like us to move away from this sort of hero worship culture that sometimes develops around individual scientists.

Jen Farmer: From the heart of the Ohio State University on the Oval, this is Voices of Excellence from the College of Arts and Sciences, with your host David Staley. Voices focuses on the innovative work of Arts and Sciences faculty and staff. With departments as wide ranging as art, astronomy, chemistry and biochemistry, physics, emergent materials and mathematics, and languages, among many others, the college always has something exciting happening. Join us to find out what's new, now.

David Staley: I am joined today in the ASC Marketing and Communication Studio by John Herbert, Professor in the Department of Chemistry and Biochemistry, the [00:01:00] Ohio State University College of the Arts and Sciences. Dr. Herbert, welcome to Voices.

John Herbert: Thank you.

David Staley: I'd like to begin first with your work in computational molecular quantum mechanics, or what we sometimes call quantum chemistry. Maybe let's start with a definition of what we're talking about.

John Herbert: Right. So, quantum chemistry, if you don't want it to be such a mouthful, is basically the application of quantum mechanics by people who are trained in chemistry, trained as chemists, or in other words, the application of quantum mechanics to problems in chemical science, problems in materials science. So, quantum mechanics, you know, if some of my physics colleagues will forgive me, is, for the molecular and materials sciences, pretty close to being a theory of everything.
You know, that's certainly not true if you want to go down to the subatomic level; there are some relativistic things that we need to graft on kind of ad hoc. But for a lot of chemical science, it's pretty close to a theory of everything, and it was written down in a form that we would still recognize today by Erwin Schrödinger. And modern computational quantum [00:02:00] mechanics, modern quantum chemistry, really consists in devising algorithms and software, computer codes, to solve Erwin Schrödinger's equation for realistic chemical problems. So, that's what we do.

David Staley: Remind us what Schrödinger's problem was.

John Herbert: Oh, well, the original problem was the hydrogen atom, solved in the Swiss Alps over his Christmas vacation in 1925 with a woman who probably was not his wife, you know, et cetera, et cetera. It's a long story. But, you know, one thing that I was thinking, for a general audience: there's obviously a lot of popular attention these days to the AI, the artificial intelligence, revolution. In the chemical sciences, AI often gets called machine learning, so I would use those terms basically synonymously. And so the AI revolution has certainly touched my field as well, but the question is, where do you get the data, right? So you may be [00:03:00] familiar, you may have read last year, that Microsoft is paying to reactivate one of the decommissioned reactors at Three Mile Island.

David Staley: Right. I think I did see that. Yeah.

John Herbert: And that gives you some idea of just how power hungry data centers are, and that's not just machine learning, but it's a growing part of it, right? So, part of that is, you know, storing all of your photos from Facebook for the last 30 years, but another part of it is training these AI models, training ChatGPT or Microsoft's knockoff of it.
And so, that power requirement then gives you some indication of what the data requirement is to train one of these models. You know, ChatGPT gets trained by gathering up a lot of copyrighted text and feeding it into the model; so-called large language models get trained on written language. In the chemical sciences, we train [00:04:00] on data from computational quantum mechanics.

David Staley: Hmm.

John Herbert: Right. And so, you know, it's similarly incredibly expensive. So what I tell my students, and we dabble a little bit in AI in my group, but it's definitely not a main research thrust for me at this point, is that this AI revolution doesn't obviate the things that we do; it actually makes them much more important. There's a need for relatively low cost computational quantum mechanics methods to generate accurate training data, so that we can build models that are then orders of magnitude faster. And orders of magnitude faster means that we can scale them up both in time and space, right? Much longer timescales, much larger model systems.

David Staley: And when you say lower the cost, the cost of the electricity and these sorts of things, how do you accomplish that? How do you conduct this at lower cost?

John Herbert: How do I put my entire research program onto a bumper sticker? Uh, [00:05:00] no. You know, that's the goal, right, and I can't claim that I'm very good at it. But in short, it's the application of applied mathematics to chemical problems. And so, you know, I take students that have backgrounds in chemistry, maybe a little bit of applied mathematics, maybe a little bit of computer programming, and try to turn them into software engineers, you know, PhD algorithm scientists, in five years or fewer.
And you know, in that sense, I would say that what I do has been multidisciplinary for a long time. I mean, there's an increasing push to call everything interdisciplinary; you know, I just want to be a physical chemist, but that's always been a little bit of chemistry, a little bit of physics, a little bit of applied math, a little bit of computer science, and putting those all together.

David Staley: And the idea is to be able to produce results that, say, AI could do, as well or [00:06:00] better?

John Herbert: That's right. So, it's to generate training data.

David Staley: Mm-hmm.

John Herbert: So basically, machine learning models, artificial intelligence, ChatGPT, whatever the Microsoft version of ChatGPT is, that I'm spacing on, all of those things are essentially interpolation models. So, in a sense, they take a bunch of data points, and for a large language model those data points are, you know, the text of _Hamlet_; they take a bunch of data points and try to draw a smooth curve through those data points, right? And they do incredibly well if the answer to your question lies close to that curve, or in other words can be interpolated between known data, and they do spectacularly badly if you ask a question that's far outside of the training data, or in other words if you ask it to extrapolate rather than interpolate. And, sort of, the kicker is you don't necessarily know which you're getting. Now there is some work from within the AI community on trying to figure out when the model is extrapolating, not interpolating, things like [00:07:00] consensus learning, where you train multiple models and you trust them as long as they're all giving you basically the same answer.
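The consensus-learning idea described here can be sketched in a few lines of Python. Everything in this sketch, the toy models, the seeds, and the agreement threshold, is an illustrative assumption, not anything from the conversation:

```python
# Minimal sketch of consensus learning: train several slightly different
# models on the same underlying relationship (here, y = 2x), and trust a
# prediction only where the ensemble agrees. All names and numbers are
# illustrative.
import random

def train_noisy_model(seed: int):
    """Stand-in for training: a linear fit with seed-dependent noise."""
    rng = random.Random(seed)
    slope = 2.0 + rng.uniform(-0.05, 0.05)  # each "model" differs slightly
    return lambda x: slope * x

models = [train_noisy_model(s) for s in range(5)]

def consensus(x: float, tolerance: float = 0.5):
    """Return the mean prediction if the ensemble agrees, else None."""
    preds = [m(x) for m in models]
    if max(preds) - min(preds) <= tolerance:
        return sum(preds) / len(preds)
    # Disagreement is the signal that the model is extrapolating and
    # more training data is needed near x.
    return None

print(consensus(1.0))    # near the training regime: models agree, ~2.0
print(consensus(100.0))  # far away: disagreement grows with x, → None
```

The point of the sketch is the asymmetry John describes: near the data the models agree and interpolation is cheap; far from it, their small differences are amplified and the disagreement flags the need for new (expensive) quantum-chemistry data points.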
And when they don't give you the same answer, that's when you know that you need more data, and that's where I come in: generating those data points through which ChatGPT, or the chemical sciences version of ChatGPT, is going to very quickly draw the smooth curve, right? But getting the data is expensive. So, for example: computational quantum chemistry became a thing starting in the 1960s. Actually, my emeritus colleague, Russell Pitzer, did the first, or certainly one of the first, polyatomic quantum chemistry calculations: he calculated the barrier to internal rotation of the ethane molecule, which is C₂H₆, so a very small molecule. That was an important problem at the time, though; it's one that his father had actually worked on experimentally 30 years earlier. He did that for his PhD thesis in [00:08:00] 1964, I believe. And so the next three decades of the 20th century were sort of characterized by coming up with very good algorithms for solving Schrödinger's equation, Erwin Schrödinger's equation, for small molecules. And that's about the time that I came in; I got my PhD in 2004. In a sense, the small molecule problem was largely solved at that point, but we would like to scale these things up to larger and larger molecules. You know, my group does a lot of work these days on trying to compute high precision thermochemistry for enzymatic reactions, so for proteins that have thousands of atoms. For materials science applications, you want to scale up to much larger models. And there, you have to confront the fact that the cost of computational quantum mechanics scales incredibly poorly with system size, scales non-linearly with system size. And so, to pick a particular method that we like for thermochemical problems, for computing energies of [00:09:00] reaction: it scales as the seventh power of the number of atoms; we call that N-to-the-seventh scaling.
David Staley: Wow.

John Herbert: And what that means, the standard explanation, is that with N-to-the-seventh scaling, if you double the system size, then the cost goes up by two to the power seven; that's a factor of 128. I like to phrase it a little bit differently, because if you're really working at sort of the bleeding edge, you probably can't double your system size. A different way of thinking about it is that the seventh root of two, two to the power of one-seventh, is about 1.1, and what that means is if you have an N-to-the-seventh method and you're willing to double the compute time, you can increase the system size by a factor of 1.1. 10%.

David Staley: Mm-hmm.

John Herbert: So, a 10% increase in system size doubles the cost. And it's actually worse than that, because what is usually the limiting factor is not CPU cost but memory, and by the time you're up to 10 or [00:10:00] maybe 20 atoms, you're at terabyte storage requirements. And so we do all of this work at the Ohio Supercomputer Center, and I should...

David Staley: On West Campus.

John Herbert: Yeah, yeah, exactly. Although the Supercomputer Center people would be quick to point out that OSC is a State of Ohio resource, not an Ohio State resource.

David Staley: I stand corrected.

John Herbert: And they're right. I mean, we have incredibly good computing resources in the state of Ohio.

David Staley: Mm-hmm.

John Herbert: Could not do what I do without them. Even so, you know, I just said 10 atoms is a terabyte storage problem; the Ohio Supercomputer Center has thousands and thousands of compute nodes, and they have on the order of 10 that have that much memory, right? And so, if we want to attack these problems with the standard methods that existed at the time of my PhD, 20 years ago, that's what we're limited to, right?

David Staley: Mm-hmm.
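The scaling arithmetic above is easy to check; this short sketch just reproduces the two numbers quoted in the conversation (2⁷ = 128 and 2^(1/7) ≈ 1.1), with a hypothetical helper function:

```python
# Cost of an N^p-scaling method when the system size grows by some factor.
# The N^7 example is the one discussed in the conversation.

def relative_cost(scale_factor: float, power: int = 7) -> float:
    """Cost multiplier when the system size grows by scale_factor."""
    return scale_factor ** power

# Doubling the system size with an N^7 method:
print(relative_cost(2.0))  # 2**7 = 128x the cost

# Conversely: how much larger a system can you treat for 2x the cost?
growth = 2.0 ** (1.0 / 7.0)
print(round(growth, 3))    # ~1.104, i.e. only about 10% larger
```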
John Herbert: And so the thrust in my group has been: how can we design better algorithms [00:11:00] that invoke specific approximations for specific types of problems? Maybe give up on having a theory of everything, but in return you get something that is maybe a theory of one thing, that you can apply to much larger systems.

David Staley: How many chemists, physical chemists, are thinking about cost in the same way that you are? Is this standard practice now, or are you something of a pioneer in this?

John Herbert: That's a loaded question.

David Staley: Okay.

John Herbert: So, I don't think that I'm a pioneer in thinking about cost, and actually I'll use that as a segue to something I wanted to talk about anyway, which I call proletariat computing.

David Staley: Ah, okay.

John Herbert: I used to call it blue collar computing, but it turns out that OSC has an old program that they used to call blue collar computing, so in the interest of not plagiarizing, let's call this computing for the proletariat. Um, so I mean, everyone who does quantum chemistry is concerned at some level about cost, because these calculations are [00:12:00] expensive, right? And so you sort of can't not be, and you need to design your expectations in a realistic fashion. But I think one area where people's thinking has been a little bit misguided is with regard to parallelization. There's been a lot of effort, a lot of pressure from the funding agencies, to design software that can run effectively on hundreds of thousands of processors, and I understand the political game that those program managers are playing.
Think about the Jaguar machine that came online 10 or 15 years ago at Oak Ridge National Lab, and for a short time was the fastest supercomputer in the world; there are faster computers now, I just can't remember their names. I think the main cluster there had 250,000 processors, and so there was a big push to find software that could run simultaneously on all quarter of a million processors, and the idea was that then we can go to Congress and say, look, we have software that scales to this entire machine, [00:13:00] so clearly we need you to authorize money for us to build larger and larger machines, right? The problem is that nothing scales with perfect efficiency, and in fact, some of the parallel efficiencies that they were willing to tolerate were incredibly low, you know, 10% parallel efficiencies, meaning you double the number of processors and you eke out another 10% of the compute time. And then they'll tell you, well, we can solve this very challenging problem in half an hour running on 250,000 processors, and I think that's disingenuous, because the real cost of that calculation is not 30 minutes, it's 125,000 hours, right, because that's what the carbon footprint of that calculation looks like.

David Staley: Mm-hmm.

John Herbert: Right? And so what we have really focused on in my group, this is the proletariat part, is what can we do with workstation level computing resources, you know, with the computer that's on your desk, or, maybe slightly more [00:14:00] realistically, a few machines like what's on your desk strung together. So, a few nodes at the Ohio Supercomputer Center, but not all 8,000 or so nodes that are on their Pitzer cluster, right?
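The cost accounting behind this objection can be made explicit. The processor count and wall-clock time are the numbers from the conversation; treating efficiency times CPU-hours as the "useful" serial-equivalent work is a standard back-of-the-envelope assumption, not his exact formulation:

```python
# Wall-clock time can look great while the real resource cost is enormous.
processors = 250_000
wall_clock_hours = 0.5           # "half an hour"

# The true cost (and carbon footprint) is processor-hours, not wall time:
cpu_hours = processors * wall_clock_hours
print(cpu_hours)                 # 125000.0 CPU-hours, not 30 minutes

# At 10% parallel efficiency, the same work could in principle be done
# in a tenth of those CPU-hours on a perfectly efficient machine;
# the other 90% is overhead burned to compress the wall-clock time.
efficiency = 0.10
serial_equivalent_hours = cpu_hours * efficiency
print(serial_equivalent_hours)   # 12500.0 CPU-hours of "useful" work
```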
And one thing that I like about that... and I guess I also don't want to bash too much on parallel computing, because that is the approach of last resort, and there are some problems in science that you want to tackle that are simply so large and so intractable that the only way to tackle them is to write a code that can scale to 250,000 processors, write a proposal so that you can get that much computer time, and do it. But for a lot of day-to-day stuff, that's not very efficient, right? And so we are trying to work on what we can do on tens or hundreds of processor cores. You know, just to put it in perspective, I have incredibly good computing resources here, as does anyone at an institution of higher learning in the [00:15:00] state of Ohio, through the Supercomputer Center. My partner, who works at a PUI...

David Staley: PUI?

John Herbert: Primarily undergraduate institution. She was hired to start a computational science program and got startup funds to purchase a computer cluster, and that computer cluster at her institution was two 16-core workstations, so 32 processors. And so that's just in a different league...

David Staley: Mm-hmm.

John Herbert: ...from what is available to me. But it's interesting to think about whether we can design software and algorithms that can be useful to her, and can be useful to people in developing countries, where maybe the flagship research institution of your country has computing resources that are measured in hundreds, not hundreds of thousands, of processors.

David Staley: Hmm. You have an interest in what you've called the infrastructure of science, or for science, and I'm particularly interested by something that you said before we started [00:16:00] recording. You said you're trying to change the culture of academic science, and I wonder if you'd say a little bit more about what you mean by that.
John Herbert: Well, yeah, and what I told you before we started recording is that, you know, computational quantum mechanics feels like a big intractable problem that we can only chip away at in increments over many years, with a lot of people working on it, and we have, right? My retired colleague Russ could do eight atoms at a very low level of theory in 1964, and now my group is doing thermochemical calculations on full proteins with thousands of atoms, at an accuracy that's really useful to chemists. And so, you know, that's where we've come over the roughly 80 years that the field has existed. So, why limit yourself to just one big intractable problem? The other big intractable problem is changing the culture of science, and that means a couple of different things. Actually, I wrote an op-ed in [00:17:00] preparation for this interview that I will submit somewhere after this. I wanted to do this interview and see if I had any grand revelations or if I changed my mind about it. But basically, I think the working title is that not all science is going to take us to the moon, and that's okay. And the sort of premise is that I would like us to move away from this sort of hero worship culture that sometimes develops around individual scientists, and to recognize that, as much as I might enjoy the work that I do, and I really do enjoy it, and I do think it's important to the field and to society in general, the fact that I am working on it is not that important. And, you know, I would say to academic scientists selected at random: your work is probably not actually that important in the grand scheme of things. Um, [00:18:00] because my view of scientific discovery is that very, very few scientific discoveries are really made in a vacuum. You know, the public has this perception of the outlier cases, so Albert Einstein working out special relativity in the patent office in Bern.
That's the canonical example, right? Or, you know, Robert Oppenheimer in the movie, scribbling out nuclear fission on a blackboard, right? And at least the first of those is a real example; the second one was at least...

David Staley: Hollywood.

John Herbert: Based on something real. Um, so those sorts of things happen; I don't think it's the norm. I think that the norm for scientific discovery is that you have a lot of people, most of whom are low wage graduate students and postdocs whose names will be lost to history, if indeed they're ever remembered in the first place. You have a large number of people incrementally pushing the frontier forward, until it gets to a point where numerous people are [00:19:00] poised to make an important discovery simultaneously, and somebody's gotta be first, right? And then, you know, subsequent disputes over priority, I think, are often artifacts of the fact that in many cases there were numerous people poised to make the same discovery simultaneously. If you think about it in that way, then you realize what's really important to advancing science. And when I say my work is not that important, and this person's work is maybe not that important, that's not at all to say that I don't think the U.S. should be investing heavily in science, right? It's the opposite of that. In fact, it's the exact opposite of that. It's saying we need more investment in science, because we need more people working on these big intractable problems and pushing the frontiers forward, right? A critically important part of academic science is workforce training and workforce development.

David Staley: Hmm.

John Herbert: And that's what I [00:20:00] think sometimes gets lost or forgotten or ignored when we focus too much on, you know, this great man approach, great person approach, to science.
Actually, I remember a life-changing experience that I had in college was reading Howard Zinn's "History of the American People".

David Staley: Right. _A People's History of the United States_.

John Herbert: Yes, yes, exactly. Sorry, it's been 20-something, 30 years. Yeah, people's history.

David Staley: A notorious book in my field.

John Herbert: So, exactly. This was exactly his attempt to get around, and I think I'm taking that phrase directly from him, the great man approach to teaching history, and to do something different: to teach it as the people's history of the country. And so I guess maybe that's where I'm going with this, and maybe it's just congealing in my mind that this is, you know, the people-forward approach to thinking about how we do science. And it means acknowledging the labor of those graduate students and postdocs. It tends to be that we heap [00:21:00] all of the accolades on the PI.

David Staley: The principal investigator.

John Herbert: Exactly. You know, the group leader takes all the credit, and then in those infrequent, but unfortunately not entirely rare, instances of scientific fraud, it's the trainees that get thrown under the bus.

David Staley: Hmm. Right. It's very hierarchical, the way most labs are established. Is that an unfair characterization?

John Herbert: "Most" is a statistical statement about greater than 50% that I'm not prepared to defend, but too many. But I think you're exactly right, and I think one thing that has been lost in those cases is the student-advisor relationship, in the way that I think still exists in the humanities.

David Staley: Mm-hmm.

John Herbert: That, you know, you take PhD students and you actually mentor them, and you don't have your postdocs doing the mentoring.

David Staley: Mm-hmm.
John Herbert: But, so, I mean, I said at the outset that I think there are two aspects to this sort of personnel problem, and one of them is just recognizing the importance of workforce training, [00:22:00] and once you've come to terms with that, once you've decided that that's an important objective of academic science, then immediately you have to ask: well then, what's the reward structure to get people to do that? Because, you know, scientists are not that different from other people in many respects, maybe not as different from the average person as some scientists would like to think.

David Staley: John Herbert, thank you.

John Herbert: Thanks.

Jen Farmer: Voices of Excellence is produced and recorded at the Ohio State University College of Arts and Sciences Marketing and Communications Studio. More information about the podcast and our guests can be found at go.osu.edu/voices. Produced by Doug Dangler. I'm Jen Farmer. [00:23:00]