On June 2, 2015, Facebook announced the opening of an artificial intelligence laboratory in Paris. It was a first for France at the time, motivated by the conviction of the company’s Chief AI Scientist, Yann LeCun, that the country had a pool of untapped talent. Five years later, Antoine Bordes, Co-Managing Director of Facebook AI Research (FAIR), takes stock of this adventure for us. He looks back at the laboratory’s landmark projects and the development of a Parisian AI ecosystem, but also at how FAIR operates globally, its relationship with other Facebook divisions, and the next advances in AI research.
The Digital Factory: Five years after its launch, what is your assessment of Facebook’s AI Research laboratory in Paris?
Antoine Bordes: It’s a huge success. When we opened the lab in 2015, it had a core team of five people. We are now 80 people in Paris, which is very strong growth in a field as competitive as AI. This is all the more notable because our recruitment standards are extremely high, and they are the same in every region where we recruit. FAIR Paris is without a doubt one of the best artificial intelligence research laboratories in the world.
The idea of France as a mecca for artificial intelligence research surprises no one today, but it was not at all obvious five years ago…
That’s right, and Facebook was also much smaller than it is today. We must give credit to Yann LeCun: he was the one who pushed this project, he really believed in it and he was right. Today it is our largest research center, along with the one in California. Other big companies like Google have followed in recent years, encouraged by the President of the Republic, but we were the pioneers. Several French expatriates have returned, myself included, and there are also Americans… We now have twelve nationalities from all over continental Europe.
One of the reasons for your presence in Paris is the excellent level of research in France. How do you take advantage of that?
In five years, we have signed partnerships with all French research institutes: Inria, CNRS, PRAIRIE… We host more than 30 PhD students through the Cifre program, which is huge. We participate in bilateral projects and also make donations to research. In particular, we contributed 3 million euros to the Jean Zay supercomputer, which was inaugurated in January and is managed by GENCI. All of this contributes to the creation of an ecosystem, and we are proud of it. Our students tend to stay with us, but some are now at Google. Some former researchers from our laboratory have also left to create start-ups in Paris. This is the case for Alexandre Lebrun, who founded Nabla, a start-up specializing in health.
What are FAIR Paris’ most iconic projects so far?
More than 50 projects have been completed, but I will give you three that I think are representative. The first is unsupervised translation of text, without a dictionary that maps between the two languages. The project started in mid-2017 and the idea was a bit outlandish at the time, but it worked very well. We showed that it is possible. We won the best paper award at the biggest conference on the subject, EMNLP. And it addresses a real problem. There are more than 200 languages on Facebook. Translating directly between each of them and all the others gives about 40,000 language pairs… It’s simply impossible to collect that many data sets. For rare languages [without many existing translations to other languages], this technology is now used to produce translations.
The second project is called DensePose and dates back to early 2018. It is a computer vision project: a system for estimating pose and intent for the human body. We track 5,000 points on the surface of the body, which makes it possible to understand a 2D image in 3D. Modifications can then be applied, for example to change people’s clothes. And it can now be done on deformable objects, especially animals. We will be presenting research on the subject at the CVPR 2020 conference.
The third project is called BlenderBot and is about conversation. It was published in April this year. We trained the model with self-supervised learning, using Reddit data among other sources. This allowed us to create a model with 9.4 billion parameters, more than three times larger than the previous largest model. And we saw that we were getting some interesting performance. We evaluated it with specialists, and it appears to outperform all other chatbots at holding interesting and natural conversations that can seem human.
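The figure of roughly 40,000 language pairs quoted for the translation project can be checked with a short calculation. The sketch below assumes exactly 200 languages (the interview says "more than 200") and counts each translation direction separately; the function names are illustrative only.

```python
def directed_pairs(n_languages: int) -> int:
    """Count ordered (source, target) pairs with source != target."""
    return n_languages * (n_languages - 1)


def unordered_pairs(n_languages: int) -> int:
    """Count language pairs when translation direction is ignored."""
    return n_languages * (n_languages - 1) // 2


# Assumption: exactly 200 languages, used as a round illustrative number.
n = 200
print(directed_pairs(n))   # 39800, i.e. roughly 40,000
print(unordered_pairs(n))  # 19900
```

Either way, the number of parallel data sets needed grows quadratically with the number of languages, which is why a method that needs no aligned data is attractive for rare languages.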
How are the technologies you develop transferred within Facebook? Are you also involved in the product teams?
No, our participation is minimal. It is not our core business. FAIR’s mission is to advance artificial intelligence research with transparency and openness. That’s why all our projects are done on public data and are published in their entirety as open source. It also means being open to criticism. Of course, our research can be, and is, integrated into Facebook’s products. But we are not involved in product development.
Facebook also has numerous virtual and augmented reality research laboratories, combined under the name Facebook Reality Labs. Do you ever work with them?
Yes, there are a lot of collaborations with them. On DensePose, for example, we work a lot with them in London. We supply them with technological building blocks. There is also the Habitat project, which is halfway between robotics and 3D reconstruction and aims to train agents in virtual environments so they can navigate. It’s done in California and we’re working hand in hand with FRL on this project. They bring their environment reconstruction technology and we bring the AI part.
Is research progressing rapidly in the field of 3D perception?
Absolutely, we can expect a lot of progress on 3D in the years to come. These advances will be used for virtual and augmented reality as well as for mixing 3D and video. A few days ago, we published our latest work on the subject, which will be presented at the CVPR 2020 conference. There are some really amazing things. The challenge is to make 3D easy to understand, without special cameras or excessive computing power. We contributed to this with PyTorch3D, which we released as open source in February. I don’t want to make a prediction, but the technology to generate 3D avatars by scanning people and then inserting them into environments in a natural way will arrive faster than we think.
You mentioned self-supervised learning for BlenderBot. Is it the way of the future?
It’s not the future, it’s been the present for one or two years, whether for computer vision, natural language processing, or speech. For images, we have new work coming out in which self-supervised pre-training reaches results as good as those of models trained on ImageNet. For language, it has been more than a year, with RoBERTa. It can detect hate speech, for example. It is based on the XLM project that we released in 2019, which is an improved version of Google’s BERT model. It allows self-supervised pre-training on multilingual text. It then takes text in about twenty languages and can be used to detect hate speech. For speech, it took a little longer. We have published a project, conducted in California, called wav2vec, which uses self-supervised learning for speech recognition. Yann LeCun has been talking about this for four or five years.
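To make the idea of self-supervised pre-training on text more concrete, here is a minimal sketch of how training examples can be built from unlabeled text by hiding some tokens and asking a model to recover them. This masked-token setup is one common self-supervised objective; the interview does not detail the exact procedure used in the projects mentioned, and the `MASK` token, the 15% rate, and the function name are assumptions chosen for illustration.

```python
import random

MASK = "[MASK]"  # hypothetical placeholder token, for illustration only


def make_masked_example(tokens, mask_rate=0.15, rng=None):
    """Hide a fraction of tokens; the hidden tokens become the targets.

    No human labels are needed: the text itself supplies the answers.
    """
    if not tokens:
        return [], {}
    if rng is None:
        rng = random.Random(0)
    # Always mask at least one token so every example has a target.
    n_mask = max(1, round(len(tokens) * mask_rate))
    positions = sorted(rng.sample(range(len(tokens)), n_mask))
    inputs = list(tokens)
    targets = {}
    for pos in positions:
        targets[pos] = tokens[pos]  # what the model should predict
        inputs[pos] = MASK          # what the model actually sees
    return inputs, targets


sentence = "the lab opened in Paris in 2015".split()
inputs, targets = make_masked_example(sentence, rng=random.Random(42))
print(inputs)   # the sentence with one token replaced by [MASK]
print(targets)  # {position: original token}
```

Because the targets come from the text itself, the same recipe applies to any language for which raw text is available.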
On the subject of hate speech, the moderation of Facebook’s content is regularly criticized for its mistakes or its slowness. Do advances in research suggest that this problem can be solved in the coming years?
Yes and no. We have already made a lot of progress with the model I was talking about. It was applied to hate speech, which took three months, and we saw significant reductions compared with what existed before. We are seeing continuous progress. This is especially the case in multilingual settings. Previously we were pretty good in English, and in other languages it was a disaster. Now the advances are carrying over to other languages. We can still turn the crank, and progress will continue.
However, we also run into the limits of the machine when it comes to interpreting context and common sense. Suppose someone says something ironically, without meaning it, or on the contrary makes an insinuation whose scope goes beyond the literal text because of a cultural or historical context… A machine can’t handle that kind of subtlety. That’s why we talk about helping humans with moderation, but never replacing them completely. I don’t think that’s possible. The future I see is human moderators better equipped with AI.
There seems to be a race toward ever greater scale in recent years, with the development of ever more powerful computing centers to process the largest possible data sets. We see this, for example, in OpenAI’s partnership with Microsoft. Are you part of it too?
Yes, it’s also our approach with self-supervised learning: no labels but a lot of data. So we have very large computing infrastructures and we do learning on a very large scale. There is clearly this issue of very large models, and we are very competitive on it. But I also want to highlight our work on small models that use much less energy and memory and can be shipped on phones. We try to remove parts of a model while keeping the same performance, to get a model that is only one gigabyte or even 100 MB. We call it model compression.
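As an illustration of what "removing parts of a model" can mean, the sketch below applies magnitude pruning, one widely used compression technique: the weights with the smallest absolute values are set to zero. The interview does not say which compression methods FAIR uses, so this is an assumed example rather than a description of their approach, and the function name and toy weights are hypothetical.

```python
def magnitude_prune(weights, keep_fraction):
    """Zero out all but the largest-magnitude weights.

    keep_fraction is the share of weights to keep, between 0.0 and 1.0.
    """
    if not 0.0 <= keep_fraction <= 1.0:
        raise ValueError("keep_fraction must be between 0.0 and 1.0")
    n_keep = int(len(weights) * keep_fraction)
    # Rank weight indices from largest to smallest absolute value.
    ranked = sorted(range(len(weights)),
                    key=lambda i: abs(weights[i]),
                    reverse=True)
    keep = set(ranked[:n_keep])
    return [w if i in keep else 0.0 for i, w in enumerate(weights)]


# Toy weights standing in for one layer of a model (illustrative values).
weights = [0.8, -0.05, 0.3, 0.01, -0.6, 0.02]
pruned = magnitude_prune(weights, keep_fraction=0.5)
print(pruned)  # [0.8, 0.0, 0.3, 0.0, -0.6, 0.0]
```

In practice the zeroed weights only save memory once the model is stored in a sparse or otherwise compact format, and accuracy is usually re-checked (and often recovered by further training) after pruning.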
We hear a lot about computer vision and natural language processing. Is FAIR also working on other areas of AI?
Yes, these topics are highlighted because they correspond to Facebook products, but we are also working on other things. Robotics in particular. We also do work very close to applied mathematics, on learning theory. This is quasi-theoretical statistics: we study how the machine learns, how biases form and how they can be removed, and how to ensure that a system is ethical. And the third pillar is cognitive science: how humans learn, how a child manages to recognize a type of animal from only two images, that sort of thing. We work with cognitive science laboratories on the subject, Paris Sciences et Lettres and the laboratory of the Ecole Normale Supérieure in Paris in particular.
Since we are talking about cognition, what is your position on the issue of so-called “strong” artificial intelligence, or “Artificial General Intelligence”? Beyond the media controversies, the subject also sometimes divides researchers…
I am first and foremost a scientist, and I am suspicious of this term, which is poorly defined. You have to understand how research is conducted and the nature of the progress that is being made. Translation seemed to be an impossible problem to solve, and there has been tremendous progress. We can also expect great advances on 3D.
There are also very complicated things on which progress will be made, such as reasoning, and I mean complex reasoning, with dozens of steps. It is really limited today, but we are currently developing projects that link artificial intelligence with program proof, a subject on which we are very strong in France. [This discipline involves mathematical analysis of computer code to certify the absence of bugs and vulnerabilities. It is used for critical systems, especially in space or power plants.] We try to mix these techniques together, with AI that will prove these theorems itself. My intuition tells me that, once the problem has been carefully defined, we will be able to make great progress on it. But this will be reasoning in a mathematical language.
Then there is consciousness, common sense, an understanding of the world. And that’s a recipe we don’t know. At the moment we don’t even know how to work on the problem. Do we have to replicate what a baby does? Is it necessary to make the machine learn everything, combining lots of different disciplines and skills? Should we build a database with all the knowledge of the world? So far we have made very little progress. I think we are missing an ingredient, a decisive scientific breakthrough. Maybe we’ll find it, but maybe we’ll never find it.