What Is A Model And Why Does It Matter?

The models, the MODELS. Models are everywhere. You’re probably tired of hearing about models and are now tired of reading the word “model”. I will attempt to explain why you should care about them, but even if you still don’t care, you will sound smarter afterwards.

What is a Model?

A model is an attempt at replicating a thing or phenomenon. We make models of trains, cars, and airplanes, which are smaller, physical replications of something much bigger. We know what the real thing looks like at full scale, and the model gives us a smaller representation of it. It allows us to see details or “big pictures” that we might not otherwise be able to see. For example, if you were standing on the dock next to the Titanic before it launched, you’d see a giant hunk of metal. You wouldn’t see the grand staircase, the engine room, or all the people. However, if we made a model of the Titanic (you know, the kind where you can take the top off), you’d be able to see more than before! It would be a simplified version of a much bigger and more complex thing.

In the case of a mathematical model, we are trying to take something that seems complex and explain it in a simple way with math. A great example is one you learned in grade school, the equation of a line. Bear with me, I know a line is simple, but it illustrates a point (no pun intended). You can draw a straight line on a piece of paper, but what if you needed to replicate the EXACT same line somewhere else? Freehand, you couldn’t do it with perfect precision, but with math you could. You need a way to model that line, so that you can use the model to replicate the same line over and over again, exactly the same way. The model just so happens to be: \(y = ax + b\). Here \(a\) is the slope of the line and \(b\) is where it crosses the vertical axis. Once those two numbers are fixed, we can plug in any \(x\), compute the matching \(y\), and draw our line.
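If you’d like to see the line model as code, here is a minimal sketch in Python. The function name `make_line` and the values 2 and 1 for \(a\) and \(b\) are assumptions I picked for illustration; any slope and intercept would work the same way.

```python
def make_line(a, b):
    """Return a function that computes y = a*x + b for a fixed slope a and intercept b."""
    def line(x):
        return a * x + b
    return line


# Illustrative values only: slope 2, intercept 1.
line = make_line(2.0, 1.0)

# The same model always reproduces the same points, no matter where we "draw" it.
points = [(x, line(x)) for x in range(5)]
print(points)
```

Anyone who has the two numbers \(a\) and \(b\) can rebuild the exact same line, which is the whole point of having a model.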

Just a side note, a machine learning model is basically a fancy type of mathematical model. Most students in technical degree programs take a class called Differential Equations (you’ll also hear it called “diff EQ”). It’s sometimes known as the introductory class to mathematical modeling. You learn things like modeling the flow of salt and water through a tank, or finding the temperature of a pizza in the oven at a given time. Machine learning is a fascinating application of mathematical modeling, where algorithms build a model to predict certain events or numbers. You will also hear the terms “algorithm” and “model” used interchangeably in some settings.

It’s important to realize that a model is not perfect. It is a replication, not an exact clone. Going back to our Titanic example, if there were a tiny patch of scratched paint on the real hull of the ship, it would not be visible in a smaller model of the ship. You wouldn’t be able to see the chipped paint, or maybe some of the decorations, or people’s faces. You’ve stripped away quite a few layers of detail in trying to simplify it. But that is OK, because in the real world there is always randomness or uncertainty involved in building models. If a model seems to have no randomness or error at all, you should be concerned. That’s another discussion entirely, so just know that a good model is never perfect, and that’s exactly what we want.
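To see that imperfection in action, here is a rough Python sketch. It makes up some noisy points scattered around the line \(y = 2x + 1\) (invented data, not real measurements), fits a line back to them with ordinary least squares, and looks at what is left over. The helper name `fit_line`, the noise level, and the random seed are all assumptions for the sake of the example.

```python
import random


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a*x + b; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


# Invented data: the "real world" is a line plus random noise.
rng = random.Random(0)  # fixed seed so the run is repeatable
xs = [float(x) for x in range(20)]
ys = [2.0 * x + 1.0 + rng.gauss(0.0, 1.0) for x in xs]

a, b = fit_line(xs, ys)

# Residuals: what the real data did that the model does not explain.
residuals = [y - (a * x + b) for x, y in zip(xs, ys)]

print("fitted slope:", a)
print("fitted intercept:", b)
print("largest miss:", max(abs(r) for r in residuals))
```

The fitted slope and intercept should land near 2 and 1 but not exactly on them, and the residuals are not zero. That gap is the detail the model leaves out.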

Why does it matter?

No one builds a model for no reason. Well, I guess someone could, but I don’t know anyone who would. You build models because you want something. If you’re building a model train, you want a super cool train that you can show your friends and say “I built that!” You could even sell it for a profit once you build it. There is a return of joy and/or money when you’re finished building the model.

If you’re building a mathematical model, you want to learn something: you want to predict an event or a number, replicate something, or simplify a complex system.

Math is great at all of these things.

Let’s go back to the line example. Suppose someone needs a line drawn, but they don’t know exactly what it should look like. All they know is that it needs to point in a certain direction and be around a certain length. But you have a model of a line, remember? So you can take the approximate numbers they have and put them into your math model (another word for equation) and produce the perfect line. Easy!
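One way to sketch this in Python, assuming the “direction” is an angle in degrees measured from the horizontal and the line starts at a known point. The function name `line_from_direction` and the sample numbers are hypothetical, chosen only to illustrate the idea.

```python
import math


def line_from_direction(angle_degrees, length, start=(0.0, 0.0)):
    """Turn a rough direction and length into a line y = a*x + b plus its end point.

    Assumes the angle is measured from the horizontal axis.
    The slope is not meaningful for vertical lines (90 degrees).
    """
    theta = math.radians(angle_degrees)
    x0, y0 = start
    a = math.tan(theta)   # slope of the line
    b = y0 - a * x0       # intercept, so the line passes through the start point
    end = (x0 + length * math.cos(theta), y0 + length * math.sin(theta))
    return a, b, end


# Hypothetical request: "pointing up and to the right, about 10 units long".
a, b, end = line_from_direction(45, 10)
print("slope:", a, "intercept:", b, "end point:", end)
```

The approximate numbers go in, and the same precise line comes out every time.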

Let’s look at weather models (yes, these are mathematical models). Humans have collected mountains of data about the weather for the past few hundred years. But nobody really cares about weather in the past; we want to know about weather in the future! How can we predict the future? Let’s say it together: models.

Think about what we’re trying to do. Weather is a complicated system of atoms and molecules, and we just want to simplify it into something we can explain and use. We can use the massive amount of data about variables such as temperature, humidity, and barometric pressure to make a model, or replication, of the weather at a given point in time. Let’s say tomorrow is June 1st. We can take the temperature from June 1st of last year and assume that tomorrow’s will be close, if not identical. The model then becomes: \(\text{temp}_{\text{tomorrow}} = \text{temp}_{\text{thisDayLastYear}}\). That is a one-variable model, and the more variables we use, the more complex the model becomes. In reality, weather models are some of the most complex models in the world and require huge supercomputers to crunch all their numbers.
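Here is that one-variable weather model as a small Python sketch. The temperatures in `history` are invented for illustration, not real measurements, and the function name `predict_temp` is my own. A real weather model would use far more variables than this.

```python
from datetime import date

# Invented readings in degrees Fahrenheit, for illustration only.
history = {
    date(2019, 6, 1): 78.0,
    date(2019, 6, 2): 81.0,
}


def predict_temp(target_day, history):
    """Predict the temperature as whatever it was on the same day last year.

    Note: this simple lookup fails for February 29, which has no match
    in the previous year.
    """
    same_day_last_year = target_day.replace(year=target_day.year - 1)
    return history[same_day_last_year]


tomorrow = date(2020, 6, 1)
print("predicted temperature:", predict_temp(tomorrow, history))
```

The prediction is only as good as the assumption behind it, which is why adding more variables (and more data) tends to help.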

The Real Importance of Modeling

Models drive so much of our daily lives. When you check the weather, get new suggested videos on YouTube, and listen to reports about the coronavirus, you are seeing the effects of modeling. Modeling is all around us, yet so many people think of it as a totally abstract concept. If you are in a technical field then I believe you are already using, are planning to use, or will use some type of model in your job setting.

A little while ago I attended a talk about the future of work in the U.S. The major takeaway from this talk was that if you can’t adapt or think ahead in your role, you will probably become obsolete at some point in your career. Every company is trying to innovate and become better than its competition. At the end of the day, you can’t run a business based on how people “feel”; you have to run it based on revenue and increasing profit. If you cannot contribute to a growing company, then you might not be there for many more years. I mention this because there is still a prevailing belief that robots and computers will take over the job market. This is wildly inaccurate, but there is also a grain of truth there to consider.

Consider this: a computer-based machine learning model doing data mining is much more cost-effective than an analyst who does the same job in twice the time. A revenue-driven company is going to make this trade if it is not getting the results it needs. In order to implement this model, though, the company will have to hire another engineer or a better-trained analyst. Now consider an alternative solution: what if the current analyst learns how to build some machine learning models? That is so much more cost-effective for the company. The company will not spend the time and money of onboarding a new hire, and the current analyst already knows what the company needs. This is a win-win for everyone.

Now this is not meant to be a career-building article so I won’t elaborate on that aspect. My point is that I have heard all kinds of stories like this from many different career paths. If you can understand and utilize the tools that are out there for even simple modeling, you could be in a great position to add value in your current role.

This is only one application where knowledge of modeling is helpful and applicable; there are countless others.

How do I get started?

I’m glad you asked. My favorite place for learning about new data topics is DataCamp. The teachers and courses are fantastic, and if you use this link you and I both get a discount. I recommend checking out the Machine Learning for Everyone career track; it has a great introduction to the field of machine learning and modeling.

If you want to get your feet wet without paying anything, I’ve listed some helpful links below with more information. I got my start in coding (which led to my fascination with data) using Codecademy. It’s free and has interesting classes on loads of topics, and I would recommend taking the Learn R class for an introduction to statistical programming, which is commonly used to build models. If you go into any depth in modeling you will have to use a language like R or Python to build your models and algorithms.

If you have any questions or want more information, I’d love to hear from you!

by Lukas Coffey

Data Engineering Intern at Capital One
