In his Twitter description, Josh mentions that he’s trying to turn production machine learning from an art to an engineering discipline. With eight scientific papers to his name, he’s clearly not fooling around about his goal. You can explore his research at Josh’s personal site. Among other roles in his CV, Josh spent three years at OpenAI, where he did his PhD. Now he provides a course that teaches engineers about production-ready deep learning.
Check out Josh’s course: Full Stack Deep Learning.
John is an AI mastermind. He’s been involved in the field for over twenty years, and he has more than a hundred research papers to his name. And that’s just scratching the surface of John’s extensive resume.
He’s a Doctor of Learning at Microsoft Research in New York, where he works on making it easy to apply machine learning to solve problems.
John is part of the International Conference on Machine Learning. He’s also involved in the Vowpal Wabbit interactive machine learning library.
Stylianos has been involved in data science and AI for more than ten years, building an impressive list of achievements. He’s a Doctor of Computer Science, with degrees in AI, Statistics, Psychology, and Economics. This wide scientific background enables him to provide top-notch education about the technologies of tomorrow. He uses his expertise to teach people, to solve difficult problems, and also to help companies improve their efficiency.
Stylianos creates a lot of educational content about data science, blockchain, and AI on his blog The Data Scientist. If you’re looking for personalized training in the same areas, check out Tesseract Academy.
Jakub is a senior data scientist who also happens to be a chess master and coach, among his many other talents. He has been working in data science for over five years, and he’s already worked on several fascinating projects with Poland’s leading AI solution providers. Now, he’s working on a lightweight experiment management tool that enables data scientists to efficiently collect the results of experiments and turn these results into an easy-to-share knowledge base.
Check out Neptune.ai to learn about the machine learning experiment management tool that Jakub is working on.
Tarek has been involved in software development for ten years. Before that, he spent some time as an information security consultant and a pre-sales manager.
Apart from his current job as senior data scientist at TicketSwap, Tarek blogs and writes books about machine learning. He also volunteers at Global Voices Online, and is a local ambassador of the Open Knowledge Foundation in Egypt.
Check out Tarek’s latest book: “Hands-On Machine Learning with scikit-learn and Scientific Python Toolkits.” For more info about Tarek’s work, his other books and research, check out his personal site: tarekamr.com.
Bartek is an experienced deep learning researcher who has led teams that developed multiple machine learning solutions.
Their accomplishments include building a deep conversational AI system in the Polish language from scratch and developing Payability Brain—a multi-modal neural net that combines multiple types of features.
Is Python the best language for machine learning? Do you foresee any major changes to the popular ML software stack?
Josh Tobin: Right now, yes. In ML, 90% of the ideas you try fail, so iteration speed is critical. Python allows you to iterate faster (in ML) than any other language. I see many changes to the ML software stack, particularly on the infrastructure side, and possibly on the framework side as well (keep an eye on Jax), but I don’t see Python being dethroned anytime soon.
John Langford: It depends. If you are writing algorithms that will be widely used, then the favored approach is more commonly C or C++ since that can achieve higher efficiency and essentially every language can bind to C/C++ compiled objects.
On the other hand, if you are doing machine learning, then Python is the default language, which I don’t see changing anytime soon as it’s a well-known, adaptable, cleanly readable, and easy-to-use language.
Stylianos Kampakis: Python is the number one choice, with R taking second place. I don’t think there are any other contestants. Some people like languages like Julia, but I think Python has established itself as the dominant player.
Jakub Czakon: I think Python will continue to be the most popular, and there are reasons for it. As ML moves from research to production, the need for a common stack across the various parts of the ML life cycle pushes people towards Python more than R and other alternatives. That said, microservice architecture and containerization (Docker, Kubernetes) make you mostly language-agnostic. With that in mind, you should figure out which algorithm libraries you need and which language has them, and use that language for the task at hand. My go-to is Python, but if you’re working on things that are closer to (bio)statistics, like survival models, then R is likely a better choice.
When it comes to the software stack, I think we will see more adoption of tools that help with managing and productionizing ML models, such as Kubeflow or Streamlit, to name just a couple.
Tarek Amr: Python is indeed ML’s lingua franca. It is flexible, easy to read, and as a non-compiled language, it is suitable for quick iterations. It has also become more deeply entrenched in the field thanks to ML tools such as scikit-learn, TensorFlow, and PyTorch. Plus, TensorFlow and PyTorch are not just tools: big tech, or FAANG (Facebook, Apple, Amazon, Netflix and Google), releases pre-trained models in these libraries. Anyone who wants to use these models is also going to favor Python.
It’s hard to see Python going away anytime soon. I can only think of two reasons for Python’s popularity to slowly decline in the future: Edge Machine Learning and Performance. I can see the merits for the former, but not the latter. Developers building mobile apps may choose to offload the logic and run it on the mobile device. They can do so to save their servers’ cost and to make use of stronger processors shipped with mobile phones nowadays. Then, they may use Swift or other native languages used on the mobile OS. Clearly, for well defined tasks, the likes of Apple and Google are also releasing pre-trained models to be used on their mobile phones.
As for the performance argument, I don’t think this will affect Python’s popularity. Software engineers will continue to be more expensive than the processors they use, and thus, we will keep favoring Python due to its aforementioned benefits. Software engineers will find a way to speed up Python, and even implement the computationally expensive parts of their code in a more powerful language, yet this will be hidden under the hood as in the case of NumPy, pandas, TensorFlow, PyTorch, etc. That’s why I can’t really see the likes of Go, Rust, and Julia competing with Python anytime soon.
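To make the "hidden under the hood" point concrete, here is a minimal sketch that uses only the standard library. It compares a dot product written as an explicit Python loop with one that pushes the loop into C-implemented built-ins (`sum`, `map`, `operator.mul`). The function names and the sample data are illustrative assumptions, not anything from the interview, and libraries such as NumPy apply the same idea on a much larger scale. Timings will vary by machine, so none are quoted here.

```python
import operator
import timeit


def dot_pure_python(xs, ys):
    """Dot product with an explicit Python-level loop."""
    total = 0
    for x, y in zip(xs, ys):
        total += x * y
    return total


def dot_builtin(xs, ys):
    """Same dot product, but the loop runs inside C-implemented built-ins."""
    return sum(map(operator.mul, xs, ys))


# Integer inputs keep the comparison exact (no floating-point rounding).
xs = list(range(100_000))
ys = list(range(100_000, 200_000))

# Both versions compute the same value; only where the loop runs differs.
assert dot_pure_python(xs, ys) == dot_builtin(xs, ys)

if __name__ == "__main__":
    for fn in (dot_pure_python, dot_builtin):
        seconds = timeit.timeit(lambda: fn(xs, ys), number=5)
        print(f"{fn.__name__}: {seconds:.3f}s for 5 runs")
```

The calling code stays plain Python in both cases, which is the trade-off described above: the convenient language on the surface, the fast one underneath.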
Bartek Roszak: Python definitely is the best language for machine learning in terms of research and modeling. If we think about machine learning in broader terms, there are some other languages that are helpful to deliver ML solutions. For example, you can use C/C++ to deploy a model, JS to build ML system monitoring dashboards, and Scala to build data pipelines. However, Python is the only language where you can build everything that the system needs and you don’t need to start from scratch. Data scientists often aren’t professional programmers so they need simple language and a powerful community. Python offers both.
Python is indeed the main language for doing ML right now, with R coming up in second place—unless you’re writing algorithms that will be used by a lot of people, then C/C++ is favored for its efficiency and universality.
Python has many benefits that make it perfect for ML; it’s well-known, adaptable, cleanly readable, easy to use, and it allows you to iterate faster than any other language.
How quickly is the field of machine learning moving along?
Josh Tobin: Many fields of ML (e.g. language, generative models) are moving extremely quickly. Some of the fields that got a lot of people excited about ML in 2014–2015 seem to have stabilized a bit.
John Langford: The speed of a field is hard to quantify. Some press reports make it seem dramatic when it’s not. On the other hand, there is steady significant useful progress over time. One way to quantify this is via the Microsoft Personalizer service that I’ve been involved in.
When I was a graduate student 20 years ago, online learning was theoretically understood as possible but not used, and reinforcement learning was typically done on super-simplistic simulations, with the two not really working together. Now, we have a form of online reinforcement learning which anyone can use.
Stylianos Kampakis: Very fast! 6 months in ML are like 6 years in other fields. It is very difficult to keep up with everything!
Jakub Czakon: In some ways too quickly, in others not so much. I think that the modeling part, network architectures, research, but also tooling is really changing every day. Many tools that I used starting off, like Theano, are no longer with us.
On the flip side, the business understanding among machine learning folks lags behind in my opinion. ML should ultimately fuel the product, improve processes in marketing or sales, do something for someone. It’s not about building a model and putting it in production. At the end of the day, there is someone somewhere who is supposed to get value from all the beautiful maths behind those models. I don’t feel that it’s understood well enough in the community. All the ML doesn’t matter if you are not solving the correct problem, in a way that your user/customer understands. We need to get better at this, but it’s not as shiny as a distributed training of a 1.5B parameter transformer model.
Tarek Amr: It is moving very quickly indeed. You blink once and suddenly new algorithms are created, and new models are trained and released for anyone to use. This is particularly true in the fields of image and text processing. Tasks in these fields are well-defined, which means that concepts such as transfer learning shine there. We have all heard of OpenAI’s GPT-2, and a few months later GPT-3 pushed the boundaries of what is possible and shook the entire internet in disbelief.
I can attribute the jumps in machine learning to the big tech companies (FAANG), and the biggest impact is seen in transfer learning. These models cost millions of dollars to train, so only big tech companies can afford them, and it is these companies, rather than academia, that are leading the field forward. Outside those well-defined tasks, things are moving quickly enough, but not at the same pace. Companies working on specific problems such as fraud detection, process automation, and time series prediction may not have these specific models offered to them on a silver platter. Of course, the tooling for them to create the models they need is progressing and getting better, but in today’s machine learning world, the greater jumps come from the size of the data and bigger machines to train on this data. I like to say that the emphasis now is more on the machines instead of the learning.
The progress of machine learning in business is also slowed down by its surrounding ecosystem. Data engineering is not moving as quickly as it should. There aren’t many affordable solutions to store and process the data being created. Companies are capable of creating huge amounts of data, but usually not able to properly store this data or make use of it. Product managers also find it easier to imagine what software engineers can build, but what is possible via machine learning is not very clear to anyone outside the narrow field. These are two examples why companies nowadays aren’t able to get the full potential of their machine learning teams.
Bartek Roszak: It is moving extremely fast. I remember when we built a conversational AI system from scratch in the Polish language. I felt like every month some new potential game changer appeared in the field of NLP, speech recognition, and speech synthesis. We had to prototype something new every month to check if we can get better results with new technologies. Even now, there are a lot of promising papers in fields such as multi-task learning or neural nets optimizations that are published regularly.
The progress in machine learning is very fast, especially in areas like language, generative models, network architectures, or the tools used by ML specialists.
However, there are areas of ML that have stabilized, and aren’t progressing as quickly. One of the key areas that are lagging seems to be the understanding of how to generate business value with ML.
What are you working on, and what is the most burning problem to solve, or feature to create, that is currently on your mind?
Josh Tobin: I’m currently working on infrastructure to help data scientists make the leap from experimentation to production. In my opinion, the lack of tooling and methodology around production ML is the biggest thing holding back the real-world impact of the field.
John Langford: Progress is generally about expanding the scope of applicability of machine learning. There are many questions here, but one of the most interesting to me is algorithms which directly learn the causal structure of the world (as per Kinematic State Abstraction and Provably Efficient Rich-Observation Reinforcement Learning).
If we can fully develop that ability, it will enable very new functionalities—imagine a robot that learns to use its actuators directly, for example.
Stylianos Kampakis: I am working on many different things, including a new book that discusses the history of machine learning! Also, I am working on an augmented analytics product called Datalyst. I believe that the future of machine learning lies in AutoML and augmented analytics, and I am trying to push things towards that direction.
Jakub Czakon: We’ve built a tool, Neptune, that helps machine learning folks keep their experimentation organized. After talking to many ML practitioners and researchers, I came to the conclusion that the answer strongly depends on whether you’re on a team that has models in production, or doing research, or you’re part of a consultancy that builds POCs for clients to see if putting ML in there makes sense.
I think the most burning need is conditioned on the team you are working on. It can be managing experimentation, building demos quickly, monitoring production models, or efficiently deploying models on edge devices or mobile phones.
Tarek Amr: I work in a second-hand ticketing marketplace. The company’s mission is to be a safe, convenient, and fair place to buy and sell e-tickets for concerts, festivals, sports events, theater, and day trips. This means that my team and I work on mitigating fraud and building recommendation algorithms to personalize user experience on our platform. We also work on scaling our business up by automating daunting tasks and building models to predict the future. We build models to predict supply, demand and customer lifetime value, to help our colleagues help our users better.
Bartek Roszak: The modeling part seems to be the easiest now, but building the whole infrastructure around it is a challenge. Here are the challenges that appear alongside the modeling part: ETL process and feature store, implementing proper monitoring of model performance and data drifts, building tools for manual error check and labeling custom data, ensuring model and data version control, and providing data scientists with flexible computing power.
These are all fields that every mature machine learning system needs to implement correctly in order to have a robust learning system. Nevertheless, the biggest challenge for machine learning is orchestrating all systems to work as one.
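As one illustration of the monitoring item in that list, here is a minimal sketch of a data drift check. It flags a feature when the mean of recent production values has moved too far from the training-time mean, measured in training standard deviations. The function names, the sample values, and the 0.5 threshold are all assumptions made for this example. Real monitoring setups typically use richer statistics and per-feature tuning.

```python
from statistics import mean, stdev


def mean_shift_score(reference, live):
    """Return how many reference standard deviations the live mean has moved."""
    ref_std = stdev(reference)
    if ref_std == 0:
        raise ValueError("reference sample has zero variance")
    return abs(mean(live) - mean(reference)) / ref_std


def has_drifted(reference, live, threshold=0.5):
    """Flag drift when the shift score exceeds the (assumed) threshold."""
    return mean_shift_score(reference, live) > threshold


# Hypothetical feature values seen at training time and in production.
training_values = [10, 12, 11, 13, 9, 10, 12, 11]
stable_batch = [11, 10, 12, 11]
shifted_batch = [15, 16, 14, 15]

print(has_drifted(training_values, stable_batch))
print(has_drifted(training_values, shifted_batch))
```

A check like this only covers a shift in the mean of one numeric feature; changes in variance, category frequencies, or relationships between features would need other tests.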
The lack of tooling and methodology around ML production as well as building the entire infrastructure are some of the biggest challenges preventing the technology from having a wide-ranging impact on the real world.
Overcoming the challenges and expanding the scope of applicability of ML would enable many new functionalities of the technology.
Some people know machine learning as the thing that customizes their Netflix feed, while others know it from science; for example, several new drugs are being developed with the help of ML. These applications are interesting but, in your opinion, what type of problems or activities is ML going to become indispensable for in the future?
Josh Tobin: Robotics is the application of ML I’m most excited about in the long term, but we may still be a while off from it becoming ubiquitous. Knowledge management and search is one of the most underrated killer apps of machine learning. People also underestimate the long tail of bespoke applications of ML in industry.
John Langford: I believe interactive machine learning has great potential in helping humans interoperate with computer devices better. The signals that we use to control compute devices are commonly ambiguous, so if we can find the right/natural ways to decode that ambiguity, things will work much better. We aren’t there yet.
I also believe that machine learning can be super-useful in healthcare in many ways. Nudging can help support healthy habits while immune system/cancer assays can help discover the right immunotherapy choices to cure people.
Stylianos Kampakis: Any kind of personalization for sure. And this can mean anything from retail (e.g. recommender systems) to precision medicine. And also robotics. Things like autonomous vehicles and drones will dominate once they are out!
Jakub Czakon: Depends on the time-frame. I think we’ll get to the automation of pretty much everything we do today, but it may take a long time. Especially if we claim that we can automate medicine today, where half-baked solutions are doomed to fail and give ML a bad rep. In the short/mid term, we should go for aiding rather than automation.
Getting back to the question, I put my money on commercial transport, early detection of common health problems, and helping the elderly (both physically and psychologically).
Tarek Amr: I like to categorize problems that ML solves into predictions, automation, and personalization. Predictions are the first examples that come to mind when thinking of machine learning. Yet, many practitioners may jump to predicting stuff without having a clear use case for how other stakeholders may use their predictions.
Automation is clearer, especially since other teams (like project managers and software engineers) already tackle similar problems on a daily basis. I believe in a post-COVID-19 world, the need for automation will increase. All companies that were financially hurt during the pandemic will start turning to automation to cut costs. Furthermore, the advances in natural language processing fit well into automation tasks.
Personalization is another common use case. But we have to remember that personalization is best suited when it is solving a problem. People think of Netflix’s prize to build a recommendation system, as if the company was just after a cool feature to add to their product, while in fact they were after solving an existential problem for their company. Netflix, during their DVD era, wanted to get their users to want a blend of expensive and inexpensive titles, or else their business model would not have scaled well if all their users asked for expensive titles only.
Bartek Roszak: Today’s machine learning is indispensable in a lot of areas such as recommendation systems, conversational systems, and monitoring systems.
In the future, I expect that the human race will try to move forward with space exploration as there are more and more reasons to do that. We will need more intelligent robots to replace humans in certain work conditions, such as dealing with radiation, and work independently without human intervention. To establish a station in deep space, on the Moon or Mars, we will need a lot of advanced machine learning systems that are able to operate without our intervention.
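Some of the most exciting areas where ML will become indispensable are robotics, knowledge management and search, interaction between humans and computing devices, healthcare, personalization, automation, commercial transport, and space exploration.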
In a recent podcast, David Patterson stated that Moore’s law has stopped, and machines aren’t developing at a break-neck pace anymore (the same performance increase that used to happen over a few months will now take 10–20 years). He goes on to add that now, the main performance increase is going to come from domain-specific acceleration. Other experts have been warning that current machine learning models are too inefficient, wasting a lot of energy and server capacity—thus the introduction of MLPerf metrics. In light of this information, what do you think is going to be the biggest game-changer for the field of ML in the near future?
Josh Tobin: I wouldn’t bet against the ability of ML researchers to continue to build better models primarily through scale. I think the more likely bottleneck is the cost of labeled data, which is why unsupervised learning and synthetic data are such exciting research directions.
John Langford: I expect gains in efficiency of ML to provide some value. However, the game-changer in my mind is algorithms for interactive learning. Most of machine learning is based on supervised learning approaches where you know the right answer and implicitly all the wrong answers, as well as how wrong they are.
Natural real-world problems commonly do not have this structure. Instead, they look more like reinforcement learning. Mastering these areas requires significantly more thought, care, and algorithmic devices, but we are really getting there.
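The difference in feedback structure can be sketched in a few lines. In the toy epsilon-greedy bandit below, the learner only observes the reward for the action it actually chose, never what the other actions would have returned, whereas a supervised learner would be told the correct label outright. This is a generic textbook illustration under assumed reward probabilities and parameters. It is not the algorithm behind any system mentioned in the interview.

```python
import random


def run_epsilon_greedy(reward_probs, steps=1000, epsilon=0.1, seed=0):
    """Simulate an epsilon-greedy learner that sees only partial feedback.

    reward_probs[i] is the (hidden) chance that action i pays a reward of 1.
    Returns how often each action was chosen and its estimated mean reward.
    """
    rng = random.Random(seed)
    n_actions = len(reward_probs)
    counts = [0] * n_actions
    values = [0.0] * n_actions  # running mean reward per action

    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(n_actions)  # explore a random action
        else:
            # Exploit the action with the best estimate so far.
            action = max(range(n_actions), key=values.__getitem__)

        # Bandit feedback: only the chosen action's reward is revealed.
        reward = 1.0 if rng.random() < reward_probs[action] else 0.0

        counts[action] += 1
        values[action] += (reward - values[action]) / counts[action]

    return counts, values


if __name__ == "__main__":
    counts, values = run_epsilon_greedy([0.2, 0.5, 0.8], steps=5000)
    print("times chosen:", counts)
    print("estimated rewards:", [round(v, 2) for v in values])
```

The exploration step is the extra "algorithmic device" that supervised learning does not need: without it, the learner could settle on a mediocre action and never find out that a better one exists.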
Stylianos Kampakis: All these are absolutely correct. I think a potential big game changer would be the creation of a new set of algorithms which can learn more efficiently from data. Current approaches are very data hungry and slow.
Humans, on the other hand, can learn only from a few examples. So, we need “smarter” algorithms, which do not need 10 GPUs and 5 terabytes of data to run successfully! I think we might see a shift in this direction within the next few years.
Jakub Czakon: I think we should get back to fundamentals first and make sure that we are building things that are valuable for people, and not just interesting tech. I don’t believe tech is inherently good or bad; there are asymmetries.
Deep fakes for text/voice/image or video will be used more by bad actors. Algorithmic bias in systems that have feedback loops is a real and huge problem. And yet, we have bias in thinking that an algorithm is based on data, so it has to be fair.
We cannot just say, “Yeah, it’s not used properly,” and go on with our day training models and reading papers. Sometimes the best solution to building a model that brings value to society is not to build it. So I don’t think the problem we have is in the speed of building, but rather in what we are building/researching.
I like this mental experiment with an urn of innovation (first heard from Nick Bostrom). Say every innovation is a ball. Green is clearly good, red is clearly bad, and yellow is somewhere in the middle. Throughout history, we mostly found green balls, and so we sped up the process of taking out new balls. We found some tricky yellow ones like nuclear energy but, luckily for us, producing a nuclear bomb is very difficult. But say there is some innovation that we can find that can cause as much damage as a nuclear bomb, but it takes a potato, water, and a $400 laptop to build. We may be in trouble.
We should start thinking whether removing all balls from the innovation urn as fast as we can is the right way forward. At some point, especially if we don’t think about it, we may stumble upon a blood-red ball.
Tarek Amr: As mentioned earlier, machine learning broke up with academia to marry big tech. Its future is clearly in FAANG’s big pockets given its reliance on humongous data and unaffordable processing power. Thus, it is clear that the current game changer is the ability to accumulate data and the affordance of stronger machines.
Will this change in the future? Well, GPUs have proved to be useful in speeding up training times. They are still expensive to use, but like any other technology, they are expected to become cheaper in the future. The other bottleneck comes from ML algorithms. Plenty of the widely used algorithms nowadays are non-parallelizable. When we hit a processing limit, the industry moved to parallelization, and machine learning algorithms need to follow this trend to be scalable and also affordable. Besides processing power, data is the second element where big tech excels. Companies need to learn how to share their data to match the data richness FAANG has. The industry also needs to make much bigger leaps on the data storage front, where modernization steps are too timid to meet today’s needs.
Bartek Roszak: From my perspective, it will be multimodal neural nets, which take advantage of different types of data such as structured data, text data, image data, or even audio data. Every company now collects as much data as possible. If a company wants to be truly data-driven, it needs to utilize and combine all the information it needs in one model. This is a field that has not been explored to a great extent, so I expect that we will see a lot of breakthroughs there. Combining all data owned by a company in one model has great potential to be a game-changer in machine learning.
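The biggest game changers for ML in the future might come from unsupervised learning and synthetic data, algorithms for interactive learning, algorithms that learn more efficiently from less data, parallelizable algorithms and broader data sharing, and multimodal neural nets that combine all of a company’s data in one model.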
During his time as Benevolent Dictator of Python, Guido van Rossum always focused on making the language as readable and easy to learn as possible. But he recently stated that he no longer believes programming is a basic skill that everybody should learn. On the other hand, if all industries are to become digitized, one of the key roles is going to be programming robots/automation and maintaining code. What is your opinion—can programming still rise up to become a basic skill? Will AI make programming even more obscure?
Josh Tobin: I think it will be somewhere between those two futures. Many jobs will be “programming” in the sense that they involve programming a computer to perform some task repeatedly, but “programming” in the sense of writing explicit code will be rarer than interacting with an AI system to teach it what task needs to be solved.
John Langford: I believe programming is an excellent basic skill and have worked to teach my children to program. It’s a skill that everyone should have some exposure to because the algorithmic viewpoint helps you decompose complex tasks into simple ones and get things done in real life.
And let’s not forget debugging: learning how to debug your own code is a great life skill since you also learn how to debug your own thinking.
On the other hand, machine learning provides a new kind of programming—learning by demonstration is the crudest understanding of this, but it's more like “learning from experience” in general.
Tasks that can be solved by a human-readable language probably should be, so in the future I expect complex mixtures of learning and coding to be the norm. An example I’ve worked on in this direction is in this paper: A Credit Assignment Compiler for Joint Prediction.
Stylianos Kampakis: I think that learning how to code now is easier than ever. That being said, there is also a strong movement towards NoCode solutions. So, anyone can develop an app, without knowing how to code.
I think we will see more of that in the near future. While coding is more accessible than ever, it might also become less important, as NoCode solutions dominate the market.
Jakub Czakon: I think programming and software development are two different things but people often think they are the same.
Programming, which can be as simple as hacking something around, automating something that you hate doing will be valuable. I think everyone would be better off after reading “Automate the Boring Stuff with Python.” It’s like Excel, or email, or stats. If we all had a decent understanding of those things our society would be a tiny bit better, I believe.
Now software development is an entirely different thing. It takes understanding the system approach, tests, infrastructure, fail-checks and about a million other things. I don’t think we should all be software devs.
Tarek Amr: I remember once teaching Python to a classroom of 12-year-old children. They must be in their early 20s now. I am sure most of them did not end up studying computer science. Maybe none of them did. But I am sure programming opened their minds to different ways of thinking, the same way maths and even music do. That’s why I favor Guido van Rossum’s initial stance, that programming languages should be as readable and easy to learn as possible.
A couple of weeks ago, we saw people on the internet using GPT-3 to automate writing HTML code and SQL queries. Give it some time, and GPT-3 combined with AutoML will start building machine learning models based on the stakeholders’ problem description. A frontend developer, whose job is to build a web page exactly as outlined in a Jira story, should definitely worry lest his job be automated soon. Same for a machine learning practitioner who is waiting for other stakeholders to explain the solution needed, not the problem to be solved.
In reality, the job of a software engineer, like that of a machine learning engineer, is about solving problems and thinking of the bigger picture. It’s less about writing the code, and more about building the architecture, knowing where to find the data, deciding which solution scales better, and much more. These aspects are harder to automate at this moment. They may be automated in the future for sure, hard to tell, but not in the very near future at least. In brief, programming robots will not automate anyone’s job, but will make everyone's job more productive. And by the way, the phrase “making someone's job more productive” is a nicer way of saying “making companies rely on fewer employees.”
Programming is an excellent skill because the algorithmic viewpoint helps you decompose complex tasks into simple ones and get things done in real life. Learning how to code is easier than ever and it has great benefits.
Not everyone has to become a software developer, with all of the additional knowledge and skills necessary for that job—but basic programming knowledge will be increasingly important.
As for the business side, NoCode solutions are already prominent and will only get better with ML, so building software by telling AI what kind of program you need is going to become the norm.
That’s it for now, and I don’t know about you, but for me this journey into the ML world was very enlightening. Hope you enjoyed it as much as I did!
Thank you to Josh, John, Stylianos, Jakub, Tarek, and Bartek for providing us with rich insights into the fascinating domain of machine learning.
If you need an expert team for a machine learning project, tell us about your project!