

DeepMind

A neural network has been taught to copy the human voice almost perfectly

Last year DeepMind, a company working on artificial intelligence technologies, shared details of its new project WaveNet, a deep-learning neural network used to synthesize realistic human speech. The other day an improved version of this technology was released, and it will serve as the foundation of the Google Assistant digital mobile assistant.

The Google DeepMind team has created a group to teach AI ethics

Google has long been developing its own artificial intelligence under the name DeepMind. To develop the technology further, Engadget reports, the experts working on the AI have organized a group that will study the moral questions raised by the development of artificial minds.

Killer robots? Stop: even good robots are terrifyingly unpredictable

The leaders of more than a hundred of the world's leading artificial intelligence companies are deeply worried about the development of "killer robots". In an open letter to the UN, these business leaders, including Tesla's Elon Musk and the founders of Google's DeepMind, warned that autonomous weapons technology could be adopted by terrorists and despots, or be hacked to one degree or another.

Artificial intelligence is nowhere near as smart as you and Elon Musk think it is

In March 2016, DeepMind's AlphaGo algorithm defeated Lee Sedol, at the time the world's best player of the complex logic game of Go. The event became one of the defining moments in the history of the technology industry, alongside the victory of IBM's Deep Blue computer over world chess champion Garry Kasparov and the win by IBM's Watson supercomputer in the quiz show Jeopardy in 2011.

#video | Google has taught artificial intelligence parkour

In recent years, researchers at various companies have been actively developing the promising technology of artificial intelligence. Google is one of the undisputed leaders in the AI race and is working hard to improve its DeepMind artificial intelligence. In one of its latest studies, the company's engineers taught an AI to overcome various obstacles in a virtual environment, giving rise to a virtual parkour runner of sorts.

DeepMind is teaching its AI to think like a human

Last year the AlphaGo artificial intelligence beat a world champion at Go for the first time. The victory was unprecedented and unexpected, given the great complexity of the Chinese board game. Impressive as AlphaGo's win certainly was, this AI, which has since beaten other Go champions as well, is still considered a "narrow" type of AI, one that can surpass humans only in a limited range of tasks.

The games are over: AlphaGo will take on real-world problems

Last month humanity lost an important battle with artificial intelligence, when AlphaGo beat Go champion Ke Jie 3:0. AlphaGo is an artificial intelligence program developed by DeepMind, part of Google's parent company Alphabet. Last year it beat another champion, Lee Sedol, 4:1, and it has improved substantially since then.

Google's AI has been taught to be "highly aggressive" in stressful situations

Last year the famous theoretical physicist Stephen Hawking said that the advancement of artificial intelligence would be "either the best or the worst thing ever to happen to humanity". We have all seen "The Terminator", and we can all imagine what an apocalyptic hell our existence could become if a self-aware AI system like Skynet one day decided it no longer needed humanity. And the latest results from the new AI system built by DeepMind (owned by Google) are yet another reminder of the need for extreme caution when building the robots of the future.

Google DeepMind's artificial intelligence receives an "accelerator" for its learning process

As the folk saying goes, "learning is light, ignorance is darkness". The Google specialists responsible for developing DeepMind apparently follow the same principle. They seem to have felt that the AI was not absorbing new information and acquiring new skills quickly enough, so they developed an algorithm that speeds up how it recognizes, identifies, and systematizes new knowledge.

Google DeepMind's artificial intelligence has learned to read lips

Lately, Google DeepMind's artificial intelligence has been learning new functions at an astonishing pace. It has already begun to amuse itself by playing video games, and in general it hardly needs humans for self-training anymore. But there is no limit to perfection, and not long ago the AI acquired yet another new skill: lip reading.

Google DeepMind's artificial intelligence will soon face humans at StarCraft II

Lately, Google's artificial intelligence system DeepMind has been appearing ever more often in the headlines of tech publications, learning new tricks and finding applications in ever new areas of robotics. Teaching AI to play video games, though, is a relatively new trend among developers. Quite recently we wrote that an artificial mind had learned to play Doom, even beating a human at it, and now Google is teaching its system to play StarCraft II.

DeepMind's artificial intelligence no longer needs humans

The DeepMind artificial intelligence developed by Google's specialists no longer needs its creators' support in order to keep developing and improving itself. This was achieved by adding a new system, the Differentiable Neural Computer (DNC), which combines a computer's ability to store large volumes of information, the logical skills of artificial intelligence, and a neural network's ability to quickly find the necessary fragments in a data store.

Google's AlphaGo will do battle again, this time against the world's best Go player

Humanity has been given another chance to prove itself. Google's AI, which earlier this year defeated South Korea's Lee Sedol, one of the best players of the logic board game Go, will play again before the end of the year, this time against the world's best Go player, 18-year-old Ke Jie of China.

Google doesn't want to create Skynet one day, so it is building a kill switch for AI

In questions and debates about the future of artificial intelligence there are two main opposing camps. In one corner are companies such as Google, Facebook, Amazon, and Microsoft, investing "aggressively" in technology to make AI systems smarter; in the other are the great thinkers of our time, such as Elon Musk and Stephen Hawking, who believe that developing AI is akin to "summoning a demon".

DeepMind's AlphaGo AI has beaten the world champion at the logic game of Go

A highly significant event has taken place in the development of artificial intelligence. AlphaGo, a program developed by Google's DeepMind subsidiary, defeated the world Go champion, Korea's Lee Sedol, in the first of five historic games being played in Seoul. Lee lost the first game after three and a half hours of play, with 28 minutes and 28 seconds still remaining on his clock.

Why does Google want a supersmart computer that can program itself?

Google's secretive artificial intelligence researchers have described a computer that they hope will one day be able to program itself. The developers at the mysterious startup DeepMind, which Google bought for $400 million earlier this year, are trying to imitate certain properties of the human brain's short-term working memory.

Inside Google's mysterious "ethics board"

Last week the technology world was abuzz when Google announced it was spending nearly half a billion dollars on DeepMind, a British developer of artificial intelligence. Here are some reflections from Forbes' Western experts on what the consequences of these developments might be.

What does Google want from DeepMind?

It seems Google relentlessly wants to control every aspect of our lives. Perhaps we should sit down and discuss where Google's recent entry into the artificial intelligence arena will lead.

hi-news.ru

DeepMind - Wikipedia

DeepMind Technologies Limited is a British artificial intelligence company founded in September 2010.

Acquired by Google in 2014, the company has created a neural network that learns how to play video games in a fashion similar to that of humans,[4] as well as a Neural Turing machine,[5] that is, a neural network that may be able to access an external memory like a conventional Turing machine, resulting in a computer that mimics the short-term memory of the human brain.[6][7]

The company made headlines in 2016 in Nature after its AlphaGo program beat a human professional Go player for the first time, in October 2015,[8] and again when AlphaGo beat Lee Sedol, the world champion, in a five-game match, which became the subject of a documentary film.

History

The start-up was founded by Demis Hassabis, Shane Legg and Mustafa Suleyman in 2010.[9][10] Hassabis and Legg first met at University College London's Gatsby Computational Neuroscience Unit.[11] On 26 January 2014, Google announced that it had agreed to acquire DeepMind for $500 million.[12][13][14][15][16][17]

Major venture capital firms Horizons Ventures and Founders Fund invested in the company,[18] as did entrepreneurs Scott Banister[19] and Elon Musk.[20] Jaan Tallinn was an early investor and an adviser to the company.[21] The sale to Google took place after Facebook reportedly ended negotiations with DeepMind Technologies in 2013.[22] The company was afterwards renamed Google DeepMind and kept that name for about two years.[2]

In 2014, DeepMind received the "Company of the Year" award from the University of Cambridge Computer Laboratory.[23]

In September 2015, DeepMind and the Royal Free NHS Trust signed their initial Information Sharing Agreement (ISA) to co-develop a clinical task-management app, Streams.[24]

After Google's acquisition, the company established an artificial intelligence ethics board.[25] The board remains a mystery, with both Google and DeepMind declining to reveal who sits on it.[26] DeepMind, together with Amazon, Google, Facebook, IBM, and Microsoft, is a founding member of the Partnership on AI, an organization devoted to the society-AI interface.[27] In October 2017, DeepMind opened a new research unit, DeepMind Ethics & Society, focused on the ethical and societal questions raised by artificial intelligence, with the prominent transhumanist Nick Bostrom as an advisor.[28][29][30]

Machine learning

DeepMind Technologies' goal is to "solve intelligence",[31] which they are trying to achieve by combining "the best techniques from machine learning and systems neuroscience to build powerful general-purpose learning algorithms".[31] They are trying to formalize intelligence[32] in order to not only implement it into machines, but also understand the human brain, as Demis Hassabis explains:

[...] attempting to distil intelligence into an algorithmic construct may prove to be the best path to understanding some of the enduring mysteries of our minds.[33]

In 2016, Google Research released a paper on AI safety and avoiding undesirable behaviour during the AI learning process.[34] DeepMind has also released several publications via its website.[35]

To date, the company has published research on computer systems that learn to play games, ranging from strategy games such as Go[36] to arcade games. According to Shane Legg, human-level machine intelligence can be achieved "when a machine can learn to play a really wide range of games from perceptual stream input and output, and transfer understanding across games[...]."[37] Research describing an AI that played seven different Atari 2600 video games (Pong in Video Olympics, Breakout, Space Invaders, Seaquest, Beamrider, Enduro, and Q*bert) reportedly led to the company's acquisition by Google.[4] Hassabis has mentioned the popular e-sport StarCraft as a possible future challenge, since it requires a high level of strategic thinking and handling imperfect information.[38]

Deep reinforcement learning

As opposed to other AIs, such as IBM's Deep Blue or Watson, which were developed for a pre-defined purpose and only function within that scope, DeepMind claims that its system is not pre-programmed: it learns from experience, using only raw pixels as input. Technically it applies deep learning on a convolutional neural network together with a novel form of Q-learning, a model-free reinforcement learning technique.[2][39] The company tests the system on video games, notably early arcade games such as Space Invaders or Breakout.[39][40] Without any change to the code, the AI begins to understand how to play the game, and after some time plays a few games (most notably Breakout) more efficiently than any human ever could.[40]
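
For readers unfamiliar with Q-learning, the generic one-step update behind this family of methods (stated here in its tabular textbook form, not as DeepMind's exact algorithm) is

$$Q(s, a) \leftarrow Q(s, a) + \alpha \left[ r + \gamma \max_{a'} Q(s', a') - Q(s, a) \right]$$

where $s'$ is the state reached after taking action $a$ in state $s$, $r$ is the reward received, $\alpha$ is the learning rate, and $\gamma$ is the discount factor. In the deep variant, the lookup table is replaced by a convolutional network that estimates $Q$ from raw pixels.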

For most games (Space Invaders, Ms Pac-Man, and Q*bert, for example), DeepMind's system still plays below the human world record. So far, DeepMind's AI has mainly been applied to games from the 1970s and 1980s, with work being done on more complex 3D games such as Doom, which first appeared in the early 1990s.[40]

AlphaGo

In October 2015, a computer Go program called AlphaGo, developed by DeepMind, beat the European Go champion Fan Hui, a 2-dan professional (out of a possible 9 dan), five games to zero.[41] It was the first time an artificial intelligence had defeated a professional Go player.[8] Previously, computers were only known to have played Go at amateur level.[41][42] Go is considered much more difficult for computers to win than games such as chess, because its much larger number of possibilities makes traditional AI methods such as brute-force search prohibitively expensive.[41][42] In March 2016, AlphaGo beat Lee Sedol, a 9-dan Go player and one of the highest-ranked players in the world, 4-1 in a five-game match. At the 2017 Future of Go Summit, AlphaGo won a three-game match against Ke Jie, who at the time had continuously held the world No. 1 ranking for two years.[43][44] AlphaGo used a supervised learning protocol, studying large numbers of games played by humans against each other.[45]

In 2017 an improved version, AlphaGo Zero, defeated AlphaGo 100 games to 0. Zero discovered on its own many of the moves of human Go players and added new ones.

Technology

AlphaGo used two deep neural networks: a policy network to estimate move probabilities and a value network to assess positions. The policy network was trained via supervised learning and subsequently refined by policy-gradient reinforcement learning; the value network learned to predict the winners of games the policy network played against itself. After training, the networks were combined with a lookahead Monte Carlo tree search (MCTS): the policy network proposed candidate high-probability moves, while the value network (in conjunction with Monte Carlo rollouts using a fast rollout policy) evaluated tree positions.[46]
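
As a rough sketch of how the networks steer the search (this is the commonly cited form of the selection rule; the exact constants and details are in the Nature paper), each simulation descends the tree by choosing

$$a_t = \operatorname*{argmax}_a \left( Q(s_t, a) + c \, P(s_t, a) \, \frac{\sqrt{\sum_b N(s_t, b)}}{1 + N(s_t, a)} \right)$$

where $P(s, a)$ is the policy network's prior probability for the move, $N(s, a)$ is its visit count, $Q(s, a)$ is the mean evaluation of its subtree, and $c$ is an exploration constant, so that rarely visited moves with a high prior receive an exploration bonus.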

Zero trained purely through reinforcement learning, playing millions of games against itself with the sole objective of increasing its win rate; it learned without any games played by humans. Its only input features are the black and white stones on the board. It uses a single neural network rather than separate policy and value networks, and its simplified tree search relies on this network to evaluate positions and sample moves, without Monte Carlo rollouts. A new reinforcement learning algorithm incorporates lookahead search inside the training loop.[46] The AlphaGo Zero project involved around 15 people and millions of dollars' worth of computing resources.[47] Ultimately, the trained system needed much less computing power than AlphaGo, running on four specialized AI processors (Google TPUs) instead of AlphaGo's 48.[48]
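
The single network is trained to match both targets at once. The loss reported for AlphaGo Zero (reproduced here from the paper's description, with $c$ a regularisation constant) is

$$l = (z - v)^2 - \boldsymbol{\pi}^{\top} \log \mathbf{p} + c \lVert \theta \rVert^2$$

where $v$ and $\mathbf{p}$ are the network's value and move-probability outputs, $z$ is the self-play game outcome, and $\boldsymbol{\pi}$ is the improved move distribution produced by the lookahead search.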

Healthcare

In July 2016, a collaboration between DeepMind and Moorfields Eye Hospital was announced.[49] DeepMind's software would be applied to the analysis of anonymised eye scans, searching for early signs of diseases that lead to blindness.

In August 2016, a research programme with University College London Hospital was announced with the aim of developing an algorithm that can automatically differentiate between healthy and cancerous tissues in head and neck areas.[50]

There are also projects with the Royal Free London NHS Foundation Trust and Imperial College Healthcare NHS Trust to develop new clinical mobile apps linked to electronic patient records.[51]

Controversies

In April 2016, New Scientist obtained a copy of a data-sharing agreement between DeepMind and the Royal Free London NHS Foundation Trust, which operates three London hospitals where an estimated 1.6 million patients are treated annually. The revelation exposed the ease with which private companies can obtain highly sensitive medical information without patient consent. The agreement shows that, in order to conduct research seeking better outcomes in various health conditions, DeepMind Health had access to admissions, discharge and transfer data, accident and emergency, pathology and radiology, and critical care at these hospitals. This included personal details such as whether patients had been diagnosed with HIV, suffered from depression or had ever undergone an abortion.[52][53] The agreement is seen as controversial and its legality has been questioned.[26]

The concerns were widely reported and have led to a complaint to the Information Commissioner's Office (ICO), arguing that the data should be pseudonymised and encrypted.[54]

In May 2016, New Scientist published a further article claiming that the project had failed to secure approval from the Confidentiality Advisory Group of the Medicines and Healthcare Products Regulatory Agency.[55]

In May 2017, Sky News published a leaked letter from the National Data Guardian, Dame Fiona Caldicott, revealing that in her "considered opinion" the data sharing agreement between DeepMind and the Royal Free took place on an "inappropriate legal basis".[56]

The Information Commissioner's Office ruled that London's Royal Free hospital failed to comply with the Data Protection Act when it handed over personal data of 1.6 million patients to DeepMind.[57]


References

  1. ^ "DEEPMIND TECHNOLOGIES LIMITED – Overview (free company information from Companies House)". Companies House. Retrieved 2016-03-13. 
  2. ^ a b c Mnih, Volodymyr; Kavukcuoglu, Koray; Silver, David (26 February 2015). "Human-level control through deep reinforcement learning". Nature. 518 (7540): 529–33. Bibcode:2015Natur.518..529M. PMID 25719670. doi:10.1038/nature14236. Retrieved 25 February 2015. 
  3. ^ "Alphabet's DeepMind unit could be expanded to 1,000 people". 
  4. ^ a b "The Last AI Breakthrough DeepMind Made Before Google Bought It". The Physics arXiv Blog. Retrieved 12 October 2014. 
  5. ^ Graves, Alex; Wayne, Greg; Danihelka, Ivo (2014). "Neural Turing Machines". arXiv:1410.5401  [cs.NE]. 
  6. ^ Best of 2014: Google's Secretive DeepMind Startup Unveils a "Neural Turing Machine", MIT Technology Review
  7. ^ Graves, Alex; Wayne, Greg; Reynolds, Malcolm; Harley, Tim; Danihelka, Ivo; Grabska-Barwińska, Agnieszka; Colmenarejo, Sergio Gómez; Grefenstette, Edward; Ramalho, Tiago (2016-10-12). "Hybrid computing using a neural network with dynamic external memory". Nature. 538: 471–476. ISSN 1476-4687. PMID 27732574. doi:10.1038/nature20101. 
  8. ^ a b "Première défaite d’un professionnel du go contre une intelligence artificielle". Le Monde (in French). 27 January 2016. 
  9. ^ "Google Buys U.K. Artificial Intelligence Company DeepMind". Bloomberg. 27 January 2014. Retrieved 13 November 2014. 
  10. ^ "Google makes £400m move in quest for artificial intelligence". Financial Times. 27 January 2014. Retrieved 13 November 2014. 
  11. ^ "Demis Hassabis: 15 facts about the DeepMind Technologies founder". The Guardian. Retrieved 12 October 2014. 
  12. ^ "Google to buy artificial intelligence company DeepMind". Reuters. 26 January 2014. Retrieved 12 October 2014. 
  13. ^ "Google Acquires UK AI startup Deepmind". The Guardian. Retrieved 27 January 2014. 
  14. ^ "Report of Acquisition, TechCrunch". TechCrunch. Retrieved 27 January 2014. 
  15. ^ Oreskovic, Alexei. "Reuters Report". Reuters. Retrieved 27 January 2014. 
  16. ^ "Google Acquires Artificial Intelligence Start-Up DeepMind". The Verge. Retrieved 27 January 2014. 
  17. ^ "Google acquires AI pioneer DeepMind Technologies". Ars Technica. Retrieved 27 January 2014. 
  18. ^ "DeepMind buy heralds rise of the machines". Financial Times. Retrieved 14 October 2014. 
  19. ^ "DeepMind Technologies Investors". Retrieved 12 October 2014. 
  20. ^ Cuthbertson, Anthony. "Elon Musk: Artificial Intelligence 'Potentially More Dangerous Than Nukes'". International Business Times UK. 
  21. ^ "Recode.net – DeepMind Technologies Acquisition". Retrieved 27 January 2014. 
  22. ^ "Google beats Facebook for Acquisition of DeepMind Technologies". Retrieved 27 January 2014. 
  23. ^ "Hall of Fame Awards: To celebrate the success of companies founded by Computer Laboratory graduates.". University of Cambridge. Retrieved 12 October 2014. 
  24. ^ Lomas, Natasha. "Documents detail DeepMind’s plan to apply AI to NHS data in 2015". TechCrunch. Retrieved 2017-09-26. 
  25. ^ "Inside Google's Mysterious Ethics Board". Forbes. 3 February 2014. Retrieved 12 October 2014. 
  26. ^ a b Ramesh, Randeep (2016-05-04). "Google's DeepMind shouldn't suck up our NHS records in secret". TheGuardian.com. The Guardian. Archived from the original on 2016-10-13. Retrieved 19 October 2016. 
  27. ^ "Home/ Partnership on Artificial Intelligence to Benefit People and Society". 2016. Retrieved 15 October 2016. 
  28. ^ Hern, Alex (4 October 2017). "DeepMind announces ethics group to focus on problems of AI" – via www.theguardian.com. 
  29. ^ "DeepMind has launched a new 'ethics and society' research team". Business Insider. Retrieved 2017-10-25. 
  30. ^ "DeepMind launches new research team to investigate AI ethics". The Verge. Retrieved 2017-10-25. 
  31. ^ a b "DeepMind Technologies Website". DeepMind Technologies. Retrieved 11 October 2014. 
  32. ^ Legg, Shane; Veness, Joel (29 September 2011). "An Approximation of the Universal Intelligence Measure". arXiv:1109.5951  [cs.AI]. 
  33. ^ Hassabis, Demis (23 February 2012). "Model the brain's algorithms" (PDF). Nature. Retrieved 12 October 2014. 
  34. ^ Amodei, Dario; Olah, Chris; Steinhardt, Jacob; Christiano, Paul; Schulman, John; Mané, Dan (2016-06-21). "Concrete Problems in AI Safety". arXiv:1606.06565  [cs.AI]. 
  35. ^ "Publications | DeepMind". DeepMind. Retrieved 2016-09-11. 
  36. ^ Huang, Shih-Chieh; Müller, Martin (12 July 2014). "Investigating the Limits of Monte-Carlo Tree Search Methods in Computer Go". Lecture Notes in Computer Science. Lecture Notes in Computer Science. Springer. 8427: 39–48. ISBN 978-3-319-09164-8. doi:10.1007/978-3-319-09165-5_4. 
  37. ^ "Q&A with Shane Legg on risks from AI". 17 June 2011. Retrieved 12 October 2014. 
  38. ^ "DeepMind founder Demis Hassabis on how AI will shape the future". The Verge. 10 March 2016. 
  39. ^ a b Mnih, Volodymyr; Kavukcuoglu, Koray; Silver, David; Graves, Alex; Antonoglou, Ioannis; Wierstra, Daan; Riedmiller, Martin (12 December 2013). "Playing Atari with Deep Reinforcement Learning". arXiv:1312.5602  [cs.LG]. 
  40. ^ a b c Deepmind artificial intelligence @ FDOT14. 19 April 2014. 
  41. ^ a b c "Google achieves AI 'breakthrough' by beating Go champion". BBC News. 27 January 2016. 
  42. ^ a b "Research Blog: AlphaGo: Mastering the ancient game of Go with Machine Learning". Google Research Blog. 27 January 2016. 
  43. ^ "World's Go Player Ratings". May 2017. 
  44. ^ "柯洁迎19岁生日 雄踞人类世界排名第一已两年" (in Chinese). May 2017. 
  45. ^ "The latest AI can work things out without being taught". The Economist. Retrieved 2017-10-19. 
  46. ^ a b Silver, David; Schrittwieser, Julian; Simonyan, Karen; Antonoglou, Ioannis; Huang, Aja; Guez, Arthur; Hubert, Thomas; Baker, Lucas; Lai, Matthew (2017-10-18). "Mastering the game of Go without human knowledge". Nature. 550 (7676): 354–359. ISSN 1476-4687. doi:10.1038/nature24270. 
  47. ^ Knight, Will. "The world’s smartest game-playing AI—DeepMind’s AlphaGo—just got way smarter". MIT Technology Review. Retrieved 2017-10-19. 
  48. ^ Vincent, James (October 18, 2017). "DeepMind’s Go-playing AI doesn’t need human help to beat us anymore". The Verge. Retrieved 2017-10-19. 
  49. ^ Baraniuk, Chris (6 July 2016). "Google's DeepMind to peek at NHS eye scans for disease analysis". BBC. Retrieved 6 July 2016. 
  50. ^ Baraniuk, Chris (31 August 2016). "Google DeepMind targets NHS head and neck cancer treatment". BBC. Retrieved 5 September 2016. 
  51. ^ "DeepMind announces second NHS partnership". IR Pro. 23 December 2016. Retrieved 23 December 2016. 
  52. ^ Hodson, Hal (29 April 2016). "Revealed: Google AI has access to huge haul of NHS patient data". New Scientist. 
  53. ^ "Leader: If Google has nothing to hide about NHS data, why so secretive?". New Scientist. 4 May 2016. 
  54. ^ Donnelly, Caroline (12 May 2016). "ICO probes Google DeepMind patient data-sharing deal with NHS Hospital Trust". Computer Weekly. 
  55. ^ Hodson, Hal (25 May 2016). "Did Google’s NHS patient data deal need ethical approval?". New Scientist. Retrieved 28 May 2016. 
  56. ^ Martin, Alexander J (15 May 2017). "Google received 1.6 million NHS patients' data on an 'inappropriate legal basis'". Sky News. Retrieved 16 May 2017. 
  57. ^ Hern, Alex (3 July 2017). "Royal Free breached UK data law in 1.6m patient deal with Google's DeepMind" – via www.theguardian.com. 


en.wikipedia.org

DeepMind opens free access to a virtual machine-learning environment / Geektimes

Recently, representatives of DeepMind (now part of the Alphabet holding company) announced that developers were being given free access to the source code of the DeepMind Lab platform. It is a machine-learning environment based on Quake III, intended for training artificial intelligence: specifically, teaching it to solve tasks in three-dimensional space without human intervention. The platform is built on the Quake III Arena game engine.

Inside the game world, the AI takes the form of a sphere that can fly and explore the surrounding space. The goal the developers have set themselves is to teach a weak form of AI to "understand" what is happening and react to the various situations arising in the virtual world. The "character" can perform a range of actions, move through a maze, and study its immediate surroundings. "We try to develop various forms of AI capable of performing a range of tasks, from simply exploring the game world to taking actions and analysing their consequences," says Shane Legg, chief scientist at DeepMind.
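
A minimal agent loop, loosely based on the Python API described in the DeepMind Lab README at the time of release (the level name, observation key, and 7-element action layout below are assumptions to check against the repository):

    import numpy as np
    import deepmind_lab  # built from the open-sourced repository

    # An environment returning 84x84 RGB observations.
    env = deepmind_lab.Lab('seekavoid_arena_01', ['RGB_INTERLACED'],
                           config={'width': '84', 'height': '84'})
    env.reset()

    total_reward = 0.0
    while env.is_running():
        # Actions are an integer vector (look, strafe, move, fire, ...).
        action = np.zeros((7,), dtype=np.intc)
        action[3] = 1  # "move forward"; the index is illustrative
        total_reward += env.step(action, num_steps=4)  # repeat for 4 frames
        if env.is_running():
            frame = env.observations()['RGB_INTERLACED']  # image for the agent

    print('Episode return:', total_reward)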

The researchers hope the AI will be able to learn by trial and error, and games are an almost ideal setting for that. For example, DeepMind has used (and still uses) the Atari game console to teach a neural network the sequences of actions required to play.

But an open three-dimensional world that can be modified offers a far more promising training environment than the flat world of graphically simple Atari games. In the 3D world the AI is given clear tasks that change in sequence, so that the experience gained in solving each task turns out to be useful for solving the next.

Another advantage of the three-dimensional environment is that it can be used to train computer systems to react to the kinds of problems a robot might also face in the real world. Industrial robots can be trained in such a simulator without difficulty, and in some cases working with a virtual environment is far simpler than training such systems "by hand".

Meanwhile, most modern neural networks are designed to solve one specific task (image processing, for example). The developers of the new platform promise that it will help create a universal form of AI capable of solving a large number of tasks, with no human assistance required. The environment for the neural network is generated at random each time.

According to the platform's developers, it helps an AI learn in roughly the way children do. "The way you or I explored the world as children," one DeepMind employee offered as an example. "The machine-learning community has always been very open. We publish about 100 papers a year, and we have open-sourced many of our projects."

Google DeepMind has now opened the source code of DeepMind Lab, publishing it on GitHub. Anyone can download the platform's code and modify it for their own needs. The project's representatives say that participating specialists can create new game levels on their own and upload their projects to GitHub, which could help the whole community work toward its goals faster and more effectively.

This is not DeepMind's only such project. Last month its representatives signed a cooperation agreement with Activision Blizzard Inc., aimed at turning the StarCraft II environment into a test bed for artificial intelligence. Other game developers may join the project soon. Notably, the AI receives no advantage over its opponent in the game environment, relying only on visual information to make progress, just as a human does.

In practice this means Google's AI will have to anticipate what its opponent is doing at any given moment in order to respond adequately to the "enemy's" actions, and it will also need to react quickly to anything that departs from the plan. All of this will put the next level of artificial intelligence capabilities to the test. "Ultimately we want to apply these abilities to solving global problems," said Demis Hassabis, founder of DeepMind (the company Google bought in 2014, on whose work Google's AI development now builds).

Specialists in AI have expressed cautious approval of the project. "It's very good that they provide a large number of environment types," said OpenAI co-founder Ilya Sutskever. "The more types of environment a system encounters, the faster it will develop," he continued. Indeed, the three-dimensional training environment contains more than 1,000 levels and environment types.

Zoubin Ghahramani, a professor at Cambridge, believes that DeepMind Lab and other platforms for accelerating the development of artificial intelligence promote progress by giving researchers access to ready-made environments, and that projects of this kind are reasonably transparent. He also noted that a human needs far less time than a computer to reach a given level of play, which makes him doubt that even a weak form of AI can be brought anywhere near human learning speed.

geektimes.ru

Publications | DeepMind


  • All
  • Agapiou, J
  • Alcicek, C
  • Amos, D
  • Anderson, K
  • Andrychowicz, M
  • Antonoglou, I
  • Apps, C
  • Arandjelović, R
  • Assael, Y M
  • Azar, M G
  • Back, T
  • Baker, L
  • Balaguer, J D O
  • Ballard, A
  • Banarse, D
  • Banino, A
  • Bapst, V
  • Barreto, A
  • Bartunov, S
  • Battaglia, P
  • Beattie, C
  • Belov, D
  • Besse, F
  • Blackwell, S
  • Blundell, C
  • Blunsom, P
  • Bolton, A
  • Botvinick, M
  • Buesing, L
  • Burgess, C
  • Cabi, S
  • Cain, A
  • Cant, M
  • Carreira, J
  • Chadwick, M
  • Chen, Y
  • Chiappa, S
  • Chua, A
  • Cornebise, J
  • Czarnecki, W
  • Dabney, W
  • Danihelka, I
  • De Fauw, J
  • de Freitas, N
  • De Maria, A
  • Degris, T
  • Denil, M
  • Desjardins, G
  • Dieleman, S
  • Doersch, C
  • Dyer, C
  • Erez, T
  • Eslami, A
  • Espeholt, L
  • Ewalds, T
  • Faulkner, R
  • Fearon, R
  • Fernando, C
  • Fidjeland, A
  • Foerster, J
  • Fortunato, M
  • Gaffney, S
  • Gendron-Bellemare, M
  • Georgiev, P
  • Glorot, X
  • Gomez Colmenarejo, S
  • Goroshin, R
  • Grabska-Barwinska, A
  • Graepel, T
  • Graves, A
  • Green, S
  • Green, T
  • Grefenstette, E
  • Gregor, K
  • Grewe, D
  • Gruslys, A
  • Guez, A
  • Gülçehre, C
  • Hadsell, R
  • Hafner, R
  • Harley, T
  • Harutyunyan, A
  • Hassabis, D
  • Heess, N
  • Hessel, M
  • Hester, T
  • Higgins, I
  • Hill, F
  • Hillier, C
  • Hoffman, M
  • Horgan, D
  • Huang, A
  • Hubert, T
  • Hui, F
  • Hung, C
  • Hunt, J
  • Jaderberg, M
  • Jimenez Rezende, D
  • Jonschkowski, R
  • Kalchbrenner, N
  • Kavukcuoglu, K
  • Kay, W
  • King, H
  • Kirkpatrick, J
  • Kočiský, T
  • Krakovna, V
  • Kulkarni, T
  • Kumaran, D
  • Kurth-Nelson, Z
  • Küttler, H
  • Lai, M
  • Lakshminarayanan, B
  • Lampe, T
  • Lanctot, M
  • Leach, M
  • Lefrancq, A
  • Legg, S
  • Leibo, J
  • Lemmon, J
  • Lerchner, A
  • Lever, G
  • Li, Y
  • Lillicrap, T
  • Ling, W
  • Maddison, C
  • Makhzani, A
  • Matthey, L
  • Melis, G
  • Menick, J
  • Merel, J
  • Milan, K
  • Mirowski, P
  • Mnih, A
  • Mnih, V
  • Modayil, J
  • Mohamed, S
  • Moritz Hermann, K
  • Munos, R
  • Nair, A
  • O'Donoghue, B
  • Orseau, L
  • Osband, I
  • Osindero, S
  • Ostrovski, G
  • Pal, A
  • Panneershelvam, V
  • Paquet, U
  • Pascanu, R
  • Petersen, S
  • Pfau, D
  • Pietquin, O
  • Piot, B
  • Pritzel, A
  • Puigdomènech, A
  • Quan, J
  • Rabinowitz, N
  • Racaniere, S
  • Rae, J
  • Ramalho, T
  • Reed, S
  • Reichert, D
  • Reynolds, M
  • Riedmiller, M
  • Rocktäschel, T
  • Rosca, M
  • Rothörl, T
  • Ruderman, A
  • Rusu, A
  • Sadik, A
  • Santoro, A
  • Saxton, D
  • Schaul, T
  • Schlinger, E
  • Scholz, J
  • Schrittwieser, J
  • Schulman, J
  • Senior, A
  • Sifre, L
  • Silver, D
  • Simonyan, K
  • Sonnerat, N
  • Soyer, H
  • Srinivasan, P
  • Sriram, S
  • Stachenfeld, K
  • Stepleton, T
  • Suleyman, M
  • Summerfield, C
  • Sunehag, P
  • Swirszcz, G
  • Szepesvpari, D
  • Tassa, Y
  • TB, D
  • Teh, Y W
  • Teplyashin, D
  • Tirumala, D
  • Tuyls, K
  • Uria, B
  • Valdés, V
  • van den Driessche, G
  • van den Oord, A
  • van Hasselt, H
  • Vecerik, M
  • Veness, J
  • Vezhnevets, A
  • Vinyals, O
  • Viola, F
  • Wainwright, M
  • Wang, F
  • Wang, J
  • Wang, Z
  • Ward, T
  • Warde-Farley, D
  • Wayne, G
  • Weber, T
  • Weinstein, A
  • Whiteson, S
  • Wierstra, D
  • Yeo, M
  • Yogatama, D
  • York, S
  • Zambaldi, V
  • Zhang, B
  • Zisserman, A
  • Zoran, D
  • Zwols, Y

deepmind.com

DeepMind · GitHub

  • StarCraft II Learning Environment

    Python 3,232 383 Apache-2.0 1 issue needs help Updated Oct 24, 2017
  • A tool to manage kubernetes configuration using jsonnet templates

    Python 61 4 Apache-2.0 Updated Oct 23, 2017
  • C++ 98 47 GPL-2.0 Updated Oct 21, 2017
  • A customisable 3D platform for agent-based AI research

    C 4,249 857 GPL-2.0 Updated Oct 18, 2017
  • Algebra Question Answering

  • A TensorFlow implementation of the Differentiable Neural Computer.

    Python 1,714 264 Apache-2.0 Updated Oct 3, 2017
  • TensorFlow-based neural network library

    Python 5,479 685 Apache-2.0 Updated Sep 25, 2017
  • Convolutional neural network model for video classification trained on the Kinetics dataset.

    Python 89 28 Apache-2.0 Updated Sep 9, 2017
  • 107 12 Updated Jul 25, 2017
  • Learning to Learn in TensorFlow

    Python 3,187 400 Apache-2.0 Updated Jul 25, 2017
  • Dataset to assess the disentanglement properties of unsupervised learning methods

    Jupyter Notebook 83 11 Apache-2.0 Updated Jun 2, 2017
  • Question answering dataset featured in "Teaching Machines to Read and Comprehend

    Python 918 167 Apache-2.0 Updated Apr 26, 2017
  • Lua/Torch implementation of DQN (Nature, 2015)

    Lua 130 40 Updated Apr 7, 2017
  • Dataset for the spaceship task from "Metacontrol for Adaptive Imagination-Based Optimization"

    33 15 Apache-2.0 Updated Mar 23, 2017
  • Torch interface to HDF5 library

    Lua 172 91 Updated Mar 21, 2017
  • Scripts to help with Torch package documentation

    Lua 17 11 BSD-3-Clause Updated Jan 3, 2017
  • Unsupervised Data Generated for GeoQuery and SAIL Datasets

    28 8 GPL-2.0 Updated Nov 5, 2016
  • Lua 28 20 BSD-3-Clause Updated Oct 24, 2016
  • Lua 86 12 BSD-3-Clause Updated Sep 5, 2016
  • Lua 27 18 GPL-2.0 Updated Jun 18, 2016
  • Cairo lua bindings with extensions for torch

    C 7 2 MIT Updated Jun 12, 2016
  • Lua 44 19 BSD-3-Clause Updated Apr 8, 2016
  • Cephes Mathematical Functions library wrapped for Torch

    C 27 18 Updated Mar 10, 2016
  • Lua 22 30 BSD-3-Clause Updated Mar 9, 2016
  • A pretty print library for torch and lua.

    Lua 12 13 MIT Updated Jan 8, 2016
  • Style guide for Lua code.

  • LuaJIT wrapper for PLplot

    Lua 14 3 BSD-3-Clause Updated Aug 17, 2015
  • A Lua package to detect reading of undeclared variables and creating of global variables.

    Lua 11 3 BSD-3-Clause Updated Apr 28, 2015
  • Lua 35 21 BSD-3-Clause Updated Mar 27, 2015

    github.com

    Google DeepMind's AlphaGo: How it works

    Between 9 and 15 March 2016, a five-game match took place between Lee Sedol, the second-highest-ranking professional Go player, and AlphaGo, a computer program created by Google's DeepMind subsidiary. The stakes were high: Google put up a prize of one million dollars. AlphaGo won 4-1.

    How exactly did AlphaGo manage to do it? All I could figure out from the coverage was that machine learning was involved. Having a PhD in machine learning myself, I decided to take the trouble to read the paper that DeepMind published on the subject. I will do my best to explain how it works in this blog post. I have also read differing opinions about how big a deal this win is, and I will have some things to say about that myself (spoiler: I think it's a pretty big deal).

    Go vs. chess

    Go and chess are very popular board games, which are similar in some respects: both are played by two players taking turns, and there is no random element involved (no dice rolling, like in backgammon).


    In 1997, Garry Kasparov was defeated by Deep Blue, a computer program written by IBM, running on a supercomputer. This was the first time a reigning world chess champion was defeated by a computer program in tournament conditions. Superficially, AlphaGo's win against Lee Sedol can be compared to Deep Blue's win against Garry Kasparov, except that AlphaGo's win came almost 20 years later. So: what's the big deal? To answer that, we have to understand the differences between chess and Go.

    In chess, each player begins with 16 pieces of six different types, and each piece type moves differently. The goal of the game is to checkmate the opponent's king. Go starts with an empty board. At each turn, a player places a stone (the equivalent of a piece in chess) on the board. Stones all obey the same rules. The goal of the game is to capture as much territory as possible. It can therefore be argued that Go has simpler rules than chess.

    Although the rules of Go might appear simpler than the rules of chess, the complexity of Go is higher. At each game state, a player faces a choice among a greater number of possible moves than in chess (about 250 in Go vs. 35 in chess). Games also usually last longer: a typical game of Go might last 150 moves vs. 80 in chess.


    Because of this, the total number of possible games of Go has been estimated at 10^761, compared to 10^120 for chess. Both are very large numbers: the entire universe is estimated to contain "only" about 10^80 atoms. But Go is the more complex of the two games, which is also why it has been such a challenge for computers to play, until now.

    Game AI: Why Go is challenging

    To understand how AIs are capable of playing games such as chess and Go, we have to understand what a game tree is. A game tree represents game states (positions) as nodes in the tree, and possible actions as edges. The root of the tree represents the state at the beginning of the game. The next level represents the possible states after the first move, etc... For simple games such as tic-tac-toe, it is possible to represent all possible game states (the complete game tree) visually:

    [Figure: the complete game tree of tic-tac-toe. Source: Wikimedia]

    For more complex games, this quickly becomes impossible. For chess, the tree would contain 10^120 nodes, which is totally impossible to store on a computer (remember: the universe has only ~10^80 atoms).

    Knowing the complete game tree is useful for a game playing AI, because it allows the program to pick the best possible move at a given game state. This can be done with the minimax algorithm: At each game turn, the AI figures out which move would minimize the worst-case scenario. To do that, it finds the node in the tree corresponding to the current state of the game. It then picks the action that minimizes the worst possible loss it might suffer. This requires traversing the whole game tree down to nodes representing end-of-game states. The minimax algorithm therefore requires the complete game tree. Great for tic-tac-toe, but not useful for chess, and even less so for Go.

    How did Deep Blue beat Kasparov? The basic principle is that Deep Blue searched the game tree as far as possible, usually to a depth of six moves or more. It would then use an evaluation function to evaluate the quality of the nodes at that depth. Essentially, the evaluation function replaces the subtree below a node with a single value summarizing that subtree. Then, Deep Blue would proceed similarly to the minimax algorithm: the move that leads to the least bad worst-case scenario at this maximum depth is chosen.
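
    To make the last two paragraphs concrete, here is a minimal sketch of depth-limited minimax in Python (the game rules object and the evaluate heuristic are hypothetical placeholders, not Deep Blue's actual machinery):

        def minimax(state, depth, maximizing, game, evaluate):
            # At a leaf (game over or depth budget exhausted), fall back on the
            # evaluation function: exact at terminal states, heuristic otherwise.
            if depth == 0 or game.is_terminal(state):
                return evaluate(state)
            values = [minimax(game.apply(state, move), depth - 1, not maximizing,
                              game, evaluate)
                      for move in game.legal_moves(state)]
            # The maximizing player picks the best child; the opponent the worst.
            return max(values) if maximizing else min(values)

        def best_move(state, depth, game, evaluate):
            # Choose the move whose subtree value is best under optimal opposition.
            return max(game.legal_moves(state),
                       key=lambda move: minimax(game.apply(state, move), depth - 1,
                                                False, game, evaluate))

    With an exact evaluate (win/loss/draw at terminal states) and unlimited depth, this is the full minimax algorithm described above; with a heuristic evaluate and a small depth, it is the Deep Blue recipe.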

    The evaluation function relies on some form of heuristic. It becomes easier to design good evaluation functions for moves that are closer to the end of the game. This makes intuitive sense: At the beginning of a game, it is hard to tell who is going to win, whereas toward the end of the game, it is sometimes very easy to tell who is going to win (e.g. just before a checkmate). It is probably impossible to design a perfect evaluation function, but better evaluation functions lead to better game play.

    Two factors determine the strength of the AI:

    • Raw computing power. More computing power means the game tree can be searched to a greater depth, which leads to better estimates by the evaluation function. Deep Blue ran on a supercomputer (i.e. it had massive computing power).

    • Quality of the evaluation function. IBM put an enormous amount of effort into the design of the evaluation function. According to Wikipedia:

    The evaluation function had been split into 8,000 parts, many of them designed for special positions. In the opening book there were over 4,000 positions and 700,000 grandmaster games. The endgame database contained many six piece endgames and five or fewer piece positions. Before the second match, the chess knowledge of the program was fine tuned by grandmaster Joel Benjamin. The opening library was provided by grandmasters Miguel Illescas, John Fedorowicz, and Nick de Firmian.

    In summary: In spite of the large complexity of chess, Deep Blue relied largely on brute force, plus some well-designed heuristics.

    Go cannot be tackled effectively with the same approach. Go has a wider branching factor (more possible moves at each state) than chess, and games tend to be longer. Hence, it is more difficult to search the game tree to a sufficient depth. In addition, it turns out that it is more difficult to design evaluation functions for Go than for chess. The endgame in Go is sometimes said to be especially complex. At the time of writing (March 15, 2016), Wikipedia notes:

    Thus, it is very unlikely that it will be possible to program a reasonably fast algorithm for playing the Go endgame flawlessly, let alone the whole Go game.

    In light of the recent win of AlphaGo, this prediction now seems needlessly pessimistic (and also wrong).

    Monte Carlo Tree Search to the rescue

    Monte Carlo Tree Search (MCTS) is an alternative approach to searching the game tree. The idea is to run many game simulations. Each simulation starts at the current game state and stops when the game is won by one of the two players. At first, the simulations are completely random: actions are chosen randomly at each state, for both players. At each simulation, some values are stored, such as how often each node has been visited, and how often this has led to a win. These numbers guide the later simulations in selecting actions (simulations thus become less and less random). The more simulations are executed, the more accurate these numbers become at selecting winning moves. It can be shown that as the number of simulations grows, MCTS indeed converges to optimal play.

    MCTS faces an exploration/exploitation trade-off: it can tend to focus too early (after too few simulations) on actions that seem to lead to wins. It turns out to be better to include an exploration component in the search, which adds some randomness. We talked about the exploration/exploitation trade-off in a previous blog article, but in a different context.
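
    The classic rule for balancing this trade-off is UCB1: always descend into the child that maximizes

        $$\frac{w_i}{n_i} + c \sqrt{\frac{\ln N}{n_i}}$$

    where $w_i$ and $n_i$ are the child's win and visit counts, $N$ is the parent's visit count, and $c$ is an exploration constant. Below is a compact, generic UCT sketch in Python; the game object and its methods (legal_moves, apply, is_terminal, to_move, winner) are hypothetical placeholders, and this illustrates plain MCTS rather than AlphaGo's variant:

        import math
        import random

        class Node:
            # One searched state. `player` is whoever made the move leading here;
            # `wins` counts playout wins from that player's perspective.
            def __init__(self, state, move=None, parent=None, player=None, moves=()):
                self.state, self.move, self.parent, self.player = state, move, parent, player
                self.untried = list(moves)
                self.children = []
                self.visits, self.wins = 0, 0.0

        def ucb1(node, c=1.4):
            # Win rate (exploitation) plus a bonus for rarely tried moves (exploration).
            return (node.wins / node.visits
                    + c * math.sqrt(math.log(node.parent.visits) / node.visits))

        def mcts(root_state, game, n_simulations=10000):
            root = Node(root_state, moves=game.legal_moves(root_state))
            for _ in range(n_simulations):
                node = root
                # 1. Selection: descend through fully expanded nodes via UCB1.
                while not node.untried and node.children:
                    node = max(node.children, key=ucb1)
                # 2. Expansion: add one child for a previously untried move.
                if node.untried:
                    move = node.untried.pop(random.randrange(len(node.untried)))
                    mover = game.to_move(node.state)  # player about to move
                    state = game.apply(node.state, move)
                    node.children.append(Node(state, move, node, mover,
                                              game.legal_moves(state)))
                    node = node.children[-1]
                # 3. Simulation: finish the game with uniformly random moves.
                state = node.state
                while not game.is_terminal(state):
                    state = game.apply(state, random.choice(game.legal_moves(state)))
                # 4. Backpropagation: credit the playout result along the path.
                winner = game.winner(state)  # winning player, or None for a draw
                while node is not None:
                    node.visits += 1
                    node.wins += 1.0 if node.player == winner else 0.0
                    node = node.parent
            # After the simulation budget is spent, play the most visited move.
            return max(root.children, key=lambda n: n.visits).move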

    What is interesting about MCTS is that no domain knowledge or expert input about the game is required. Whereas Deep Blue used a complex evaluation function which was designed with the help of expert chess players, MCTS merely requires traversing a tree and keeping track of some numbers. Also, it is convenient that the whole game tree does not have to be expanded, as this would be impossible. However, it is necessary to run a large number of simulations in order to achieve good results.

    The strongest Go AIs (Fuego, Pachi, Zen, and Crazy Stone) all rely on MCTS. They also rely on domain knowledge (hand-crafted rules designed by experts) to better select actions during the Monte Carlo simulations. All four programs achieve strong amateur play levels.

    Beyond amateur play with learning

    All previously mentioned methods for Go-playing AI rely on some kind of tree search, combined with hand-crafted rules. AlphaGo however makes extensive use of machine learning to avoid using hand-crafted rules. Three different kinds of neural networks are combined with a tree search procedure. I will explain how the pieces fit together, but I first have to provide some background.

    What is machine learning?

    Machine learning is the art and science of designing computer algorithms that learn from data. In supervised learning (a standard setting in machine learning) an algorithm is repeatedly presented with training examples along with their corresponding labels. For example, a training example could be the game state of a game of Go, and the training label whether this state ultimately led to a win or a loss for the current player. The goal is to learn a model that generalizes well to previously unseen examples, i.e. one that is good at predicting the outcome of as-of-yet unseen Go games.

    Neural networks

    [Figure: a feed-forward neural network. Each white dot represents an artificial neuron; each colored line represents a trainable parameter. Source: Wikimedia]

    Artificial neural networks are a class of models frequently used in machine learning, in both the supervised and the unsupervised setting, because of their ability to handle large amounts of training data. Neural networks consist of a number of layers, each of which contains a number of parameters whose values are unknown a priori and need to be trained (i.e. tuned on training data). Each layer contains artificial neurons. Each neuron receives as input the outputs of neurons in the previous layer; the inputs are summed together and passed through a non-linear "activation" function. This behavior is reminiscent of biological neurons, which is where the name "neural" network comes from.
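
    As a toy illustration of the "weighted sum plus activation" idea (shapes and values here are arbitrary, not taken from AlphaGo):

        import numpy as np

        def dense_layer(x, W, b):
            # Each neuron computes a weighted sum of its inputs,
            # then applies a non-linear activation (here, ReLU).
            return np.maximum(0.0, W @ x + b)

        rng = np.random.default_rng(0)
        x = rng.normal(size=3)                                    # input vector
        h = dense_layer(x, rng.normal(size=(4, 3)), np.zeros(4))  # 4 hidden neurons
        y = dense_layer(h, rng.normal(size=(2, 4)), np.zeros(2))  # 2 output neurons

    During training, the entries of each W and b are the trainable parameters tuned on data.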

    Convolutional networks

    [Figure: a convolutional network. Source: Wikimedia]

    Convolutional networks are a sub-type of neural network especially well adapted to processing image data. Convolutional networks take an image as input. At each layer in the network, a series of filters is applied to the image. Convolutional networks are highly computationally efficient on image data because they restrict themselves to filtering operations, which are very useful for images. These networks have been applied to all kinds of tasks that take images as input, such as digit, face, and licence plate recognition.
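
    Here is a bare-bones version of the filtering operation at the heart of these networks (single channel, no padding or stride; real libraries add many refinements):

        import numpy as np

        def conv2d(image, kernel):
            # Slide the filter over the image, taking a dot product at each
            # location (cross-correlation, as deep-learning libraries do it).
            kh, kw = kernel.shape
            h = image.shape[0] - kh + 1
            w = image.shape[1] - kw + 1
            out = np.empty((h, w))
            for i in range(h):
                for j in range(w):
                    out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
            return out

        # A classic hand-crafted filter that responds to vertical edges; a
        # convolutional network *learns* the values of such filters instead.
        edge_filter = np.array([[1.0, 0.0, -1.0],
                                [1.0, 0.0, -1.0],
                                [1.0, 0.0, -1.0]])
        response = conv2d(np.random.default_rng(1).normal(size=(8, 8)), edge_filter)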

    Note that all operations are feed-forward: the output of the neural network is obtained after a series of filtering operations. No back-tracking or search procedure is applied. Intuitively, convolutional networks are well-suited for problems that can be solved quickly and intuitively by humans, such as recognizing objects in an image. They are not well suited for problems that require time and reflection, e.g. finding the exit of a maze, given an image of this maze.

    [Figure: object detection, a task well suited to convolutional networks; (most) humans can also do this very quickly and intuitively. Source: Google Research]

    [Figure: maze path-finding, not an easy problem for a convolutional network, since finding the solution requires search; a human also needs some time and reflection to find it. Source: Wikimedia]

    A note on deep learning

    Deep learning has often been mentioned in media recently. The term usually refers to training neural networks in an unsupervised manner, and greedily, layer-by-layer. The convolutional networks used by AlphaGo are indeed deep (they contain 13 layers), but they are trained in a supervised manner, and not layer-by-layer, but all in one go. Hence, strictly speaking, AlphaGo does not use deep learning.

    Edit (April 6, 2016): In the comments below, both Loïc Matthey and Gary Cottrell informed me that I'm wrong about this. While the term deep learning used to refer to networks that are trained layer-by-layer in an unsupervised fashion, the term now refers to any network with a lot of layers. AlphaGo therefore does indeed use deep learning.

    AlphaGo

    AlphaGo relies on two different components: a tree search procedure, and convolutional networks that guide the tree search procedure. The convolutional networks are conceptually somewhat similar to the evaluation function in Deep Blue, except that they are learned rather than designed. The tree search procedure can be regarded as a brute-force approach, whereas the convolutional networks provide a level of intuition to the game-play.

    In total, three convolutional networks are trained, of two different kinds: two policy networks and one value network. Both types of networks take as input the current game state, represented as an image (with some additional input features, which are not important to our discussion).


    The value network provides an estimate of the value of the current state of the game: what is the probability that the black player will ultimately win the game, given the current state? The input to the value network is the whole game board, and the output is a single number representing the probability of a win.

    The policy networks provide guidance regarding which action to choose, given the current state of the game. The output is a probability value for each possible legal move (i.e. the output of the network is as large as the board). Actions (moves) with higher probability values correspond to actions that have a higher chance of leading to a win.
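
    To make the two heads concrete, here is a schematic numpy sketch of the final layers (dimensions and weights are invented for illustration; AlphaGo's real heads sit on top of a 13-layer convolutional stack):

        import numpy as np

        def softmax(z):
            e = np.exp(z - z.max())
            return e / e.sum()

        rng = np.random.default_rng(0)
        features = rng.normal(size=128)        # stand-in for the conv stack's output

        # Policy head: one probability per board point (19 x 19 = 361 of them).
        W_policy = rng.normal(size=(361, 128))
        move_probs = softmax(W_policy @ features)

        # Value head: a single scalar squashed into (0, 1), read as P(win).
        w_value = rng.normal(size=128)
        win_prob = 1.0 / (1.0 + np.exp(-w_value @ features))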

    Training the policy and value networks

    A policy network was trained on 30 million positions from games played by human experts, available on the KGS Go Server. It achieved an accuracy of 57% on a withheld test set. When I first read the paper, I was very surprised that this was possible. I would have thought that predicting human game moves was too difficult a problem to solve with convolutional networks. The fact that a convolutional network was able to predict human play with such high accuracy seems to suggest that a lot of human Go play is largely intuitive, instead of deeply reflective.

    A smaller policy network was trained as well. Its accuracy is much lower (24.2%), but it is much faster (2 microseconds instead of 3 milliseconds: 1,500 times faster).

    Deep reinforcement learning

    Until now, the policy networks were only trained to predict human moves. But the goal should not be to be as good as possible at predicting human moves. Rather, the goal should be to have networks that are optimized to win the game. The policy networks were therefore improved by letting them play against each other, using the outcome of these games as a training signal. This is called reinforcement learning, or even deep reinforcement learning (because the networks being trained are deep).
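
    The training signal can be sketched as a REINFORCE-style policy-gradient step (a simplification of the paper's procedure, with a linear softmax model standing in for the deep network): moves from won games become more likely, moves from lost games less likely.

        import numpy as np

        def softmax(x):
            e = np.exp(x - x.max())
            return e / e.sum()

        def self_play_update(W, states, moves, outcome, lr=0.01):
            """One update per move of a finished self-play game.
            outcome is +1 if this policy won the game, -1 if it lost."""
            for s, a in zip(states, moves):
                probs = softmax(s @ W)
                grad = -probs        # gradient of log softmax w.r.t. the logits...
                grad[a] += 1.0       # ...plus 1 for the move actually played
                W += lr * outcome * np.outer(s, grad)
            return W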

    Letting the system play against itself is a useful trick to let the system improve itself, without the need for a training set of games played by humans. This trick is not new, since it was used in TD-Gammon in 1992, created by Gerald Tesauro at IBM. TD-Gammon was a backgammon-playing program that reached the performance of the best human players at the time.

    AlphaGo's performance without search

    The AlphaGo team then tested the performance of the policy networks. At each move, they chose the action that the policy network predicted to give the highest likelihood of a win. Using this strategy, each move took only 3 ms to compute. They tested their best-performing policy network against Pachi, the strongest open-source Go program, which relies on 100,000 MCTS simulations at each turn. AlphaGo's policy network won 85% of the games against Pachi! I find this result truly remarkable. A fast feed-forward architecture (a convolutional network) was able to outperform a system that relies extensively on search. This again suggests that intuition is very important in the game of Go. It also shows that it is possible to play well without relying on very long reflection.
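
    Search-free play of this kind amounts to one forward pass and an argmax over the legal moves, roughly as follows (the masking of illegal moves is my assumption about the details):

        import numpy as np

        def greedy_move(move_probs, legal_mask):
            """Pick the legal move the policy network likes best."""
            masked = np.where(legal_mask.astype(bool), move_probs, -np.inf)
            return int(np.argmax(masked))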

    Value network

    Then, a value network was trained on 30 million game positions obtained while the policy network played against itself. As a reminder, the value network is supposed to predict the likelihood of a win, given the current game state. It is therefore similar to an evaluation function, with the difference that the value network is learned instead of designed.

    We said that it is difficult to design evaluation functions that perform well at the beginning of a game, but that it becomes easier the closer the game approaches an end. The same effect is observed with the value network: the value network makes random predictions at the very beginning of the game, but becomes better at predicting the final game outcome the more moves have been played. The fact that both human-designed evaluation functions and learned value networks display the tendency to perform better towards the end of the game indicates that this tendency is not due to human limitations, but that it is something more fundamental.

    Putting all pieces together: Tree search

    AlphaGo's tree search procedure is somewhat similar to MCTS, but is guided by all three types of networks in an innovative manner. I will not go into too much detail here, as the full approach is somewhat involved.

    Similarly to Deep Blue, AlphaGo uses an evaluation function to give a value estimate of a given state. AlphaGo uses a mixture of the output of the value network and the result of a self-play simulation of the fast policy network:

    value of a state = (1 − λ) × value network output + λ × simulation result, with λ = 0.5.

    This is interesting because it suggests a mixture of intuition and reflection. The value network provides the intuition, whereas the simulation result provides the reflection. The AlphaGo team also tried to use only the value network output, or only the simulation result, but those provided worse results than the combination of the two. It is also interesting that the value network output and the simulation result seem to be equally important.
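
    In code, with hypothetical names, the mixing rule is one line:

        def leaf_value(value_net_output, rollout_result, lam=0.5):
            """Mix the value network's estimate with the outcome of a
            fast-policy rollout; lam = 0.5 weights them equally."""
            return (1.0 - lam) * value_net_output + lam * rollout_result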

    The slow policy network is also used to guide the tree search, keeping in mind the exploration/exploitation trade-off we previously mentioned. Given a game state, the policy network outputs a probability value for each legal move. This output is then divided by the number of times the move has already been taken from this state during the search. This encourages exploration by penalizing actions that have been chosen often.
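
    As a rough sketch of this selection rule (the exact form and constants in AlphaGo differ slightly; this is illustrative):

        import numpy as np

        def selection_scores(mean_values, priors, visit_counts):
            """mean_values: average simulation value per move (exploitation);
            priors: policy network probabilities (guidance);
            visit_counts: how often each move was already tried."""
            return mean_values + priors / (1.0 + visit_counts)

        # An often-visited move gradually loses its bonus to an unexplored one:
        q = np.array([0.5, 0.4]); p = np.array([0.8, 0.2]); n = np.array([10, 0])
        print(selection_scores(q, p, n))  # ≈ [0.573, 0.600] -> try the second move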

    This concludes our description of the inner workings of the AlphaGo system.

    How good is AlphaGo?

    How strong is AlphaGo compared to humans and to other AIs? The most commonly used system for comparing the strength of players is the Elo rating system. The difference in ratings between two players serves as a predictor of the outcome of a match, where a higher rating indicates a higher chance of winning.
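
    For reference, the Elo prediction is a simple logistic function of the rating difference; a 400-point gap corresponds to roughly a 91% expected score for the stronger player:

        def elo_expected_score(rating_a, rating_b):
            """Expected score of player A against player B."""
            return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

        print(elo_expected_score(2890, 2490))  # ~0.909 for a 400-point gap
        print(elo_expected_score(3140, 1929))  # Distributed AlphaGo vs CrazyStone: ~0.999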

    In the paper, written in 2015, the strength of various AIs was estimated as follows:

    AI name                      Elo rating
    Distributed AlphaGo (2015)         3140
    AlphaGo (2015)                     2890
    CrazyStone                         1929
    Zen                                1888
    Pachi                              1298
    Fuego                              1148
    GnuGo                               431

    AlphaGo ran on 48 CPUs and 8 GPUs, and the distributed version of AlphaGo ran on 1,202 CPUs and 176 GPUs.

    We see that additional computational resources lead to better game play, just like with chess. Unfortunately, no estimate of the Elo rating of AlphaGo running on a single CPU was provided.

    At the time the paper was written, the distributed version of AlphaGo had defeated Fan Hui in a five-game tournament. Fan Hui is a professional 2 dan player whose Elo rating was estimated at 2,908 at the time.

    On March 15, 2016, the distributed version of AlphaGo won 4-1 against Lee Sedol, whose Elo rating is now estimated at 3,520. The distributed version of AlphaGo is now estimated at 3,586. It is unlikely that AlphaGo would have won against Lee Sedol if it had not improved since 2015.

    There is only one remaining human with a higher Elo rating than AlphaGo (Ke Jie, at 3621).

    How important are these results?

    Superficially, both Go and chess seem to be representative of typical challenges faced by AI: The decision-making task is challenging, and the search space is intractable.

    In chess, it was possible to beat the best human players with a relatively straightforward solution: brute force search, plus some heuristics hand-crafted (with great effort) by expert chess players. The fact that heuristics had to be hand-crafted is rather disappointing, because it does not immediately lead to other breakthroughs in AI: each new problem would need new hand-crafted rules. Also, it seems that it was a lucky coincidence that chess had a state space that was very large, but still small enough to be just barely tractable.

    Go, due to its higher complexity, could not be tackled with Deep Blue's approach. Good progress was made with MCTS. This is also a somewhat disappointing solution: pure MCTS does not use any domain knowledge. This means that a Go-playing program using pure MCTS does not know anything about how to play Go at the beginning of each new game. There is no learning through experience.

    What is very interesting about AlphaGo is that it involves a learning component instead of hand-crafted heuristics. By playing against itself, AlphaGo automatically gets better and better at playing Go. This provides hope that AlphaGo's approach might adapt well to other AI problems.

    Also interesting is that in spite of massive computational power (probably much more than Deep Blue), AlphaGo evaluated thousands of times fewer positions than Deep Blue. This is because the policy and value networks are very expensive to evaluate. But this trade-off is worth it: positions are evaluated more precisely, which ultimately leads to better game play.

    I think the fact that AlphaGo won against Lee Sedol is not very interesting in itself. After all, the win relied heavily on massive computational power. Had fewer CPUs and GPUs been available, AlphaGo might have lost. What is more interesting is the approach employed, because it provides hope for the future.

    Some perspectives for the future

    A few things about AlphaGo's approach seem unsatisfying, perhaps leaving room for improvement for the future:

    • AlphaGo relies on a large dataset of games played by human players. Fortunately, a big dataset of Go games played by human experts was available, but this might not be the case for other AI problems. Also, it is possible that AlphaGo is somewhat biased toward imitating human play. If it relied only on self-play (becoming better by playing against itself), perhaps completely new strategies that are contrary to current orthodoxy might be discovered. This was the case for TD-Gammon, which relied only on self-play. According to Wikipedia:

      [TD-Gammon] explored strategies that humans had not pursued and led to advances in the theory of correct backgammon play.

    • The system is not trained end-to-end. AlphaGo has two distinct parts: the neural networks, and the tree search phase. The neural networks are trained independently of the tree search phase. Intuitively it would seem that a joint learning procedure should be possible, which might lead to a better combination of the learning and search phase.

    • AlphaGo relies on convolutional networks, a very special type of neural network, for the policy and value networks. Using convolutional networks might not be possible for all future games or AI problems in general.

    • A number of hyper-parameters need to be set manually. For example, the system is tuned to use most of its time in the middle-game. It is not clear a priori why this is necessary, and it would be more satisfying if this were learned automatically (if it indeed leads to better game play), instead of having to be set by hand.

    What's next?

    What AI problems will the DeepMind team try to tackle next? Hard to say. Computer game AI would definitely be an interesting problem, both because of its complexity as well as for the practical applications. In fact, DeepMind has already taught computers to learn to play Atari games by themselves. Computer game AI therefore looks like a straightforward next problem to tackle.

    tastehit.com

    DeepMind AI Reduces Google Data Centre Cooling Bill by 40%

    From smartphone assistants to image recognition and translation, machine learning already helps us in our everyday lives. But it can also help us to tackle some of the world’s most challenging physical problems -- such as energy consumption. Large-scale commercial and industrial systems like data centres consume a lot of energy, and while much has been done to stem the growth of energy use, there remains a lot more to do given the world’s increasing need for computing power.

    Reducing energy usage has been a major focus for us over the past 10 years: we have built our own super-efficient servers at Google, invented more efficient ways to cool our data centres and invested heavily in green energy sources, with the goal of being powered 100 percent by renewable energy. Compared to five years ago, we now get around 3.5 times the computing power out of the same amount of energy, and we continue to make many improvements each year.

    Major breakthroughs, however, are few and far between -- which is why we are excited to share that by applying DeepMind’s machine learning to our own Google data centres, we’ve managed to reduce the amount of energy we use for cooling by up to 40 percent. In any large scale energy-consuming environment, this would be a huge improvement. Given how sophisticated Google’s data centres are already, it’s a phenomenal step forward.

    The implications are significant for Google’s data centres, given its potential to greatly improve energy efficiency and reduce emissions overall. This will also help other companies who run on Google’s cloud to improve their own energy efficiency. While Google is only one of many data centre operators in the world, many are not powered by renewable energy as we are. Every improvement in data centre efficiency reduces total emissions into our environment and with technology like DeepMind’s, we can use machine learning to consume less energy and help address one of the biggest challenges of all -- climate change.

    One of the primary sources of energy use in the data centre environment is cooling. Just as your laptop generates a lot of heat, our data centres -- which contain servers powering Google Search, Gmail, YouTube, etc. -- also generate a lot of heat that must be removed to keep the servers running. This cooling is typically accomplished via large industrial equipment such as pumps, chillers and cooling towers. However, dynamic environments like data centres make it difficult to operate optimally for several reasons:

    1. The equipment, how we operate that equipment, and the environment interact with each other in complex, nonlinear ways. Traditional formula-based engineering and human intuition often do not capture these interactions.
    2. The system cannot adapt quickly to internal or external changes (like the weather). This is because we cannot come up with rules and heuristics for every operating scenario.
    3. Each data centre has a unique architecture and environment. A custom-tuned model for one system may not be applicable to another. Therefore, a general intelligence framework is needed to understand the data centre’s interactions.

    To address this problem, we began applying machine learning two years ago to operate our data centres more efficiently. And over the past few months, DeepMind researchers began working with Google’s data centre team to significantly improve the system’s utility. Using a system of neural networks trained on different operating scenarios and parameters within our data centres, we created a more efficient and adaptive framework to understand data centre dynamics and optimize efficiency.

    We accomplished this by taking the historical data that had already been collected by thousands of sensors within the data centre -- data such as temperatures, power, pump speeds, setpoints, etc. -- and using it to train an ensemble of deep neural networks. Since our objective was to improve data centre energy efficiency, we trained the neural networks on the average future PUE (Power Usage Effectiveness), which is defined as the ratio of the total building energy usage to the IT energy usage. We then trained two additional ensembles of deep neural networks to predict the future temperature and pressure of the data centre over the next hour. The purpose of these predictions is to simulate the recommended actions from the PUE model, to ensure that we do not go beyond any operating constraints.
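
    For reference, PUE itself is just a ratio (the deployed system, by contrast, predicts its future average value with ensembles of deep networks):

        def pue(total_building_energy, it_energy):
            """Power Usage Effectiveness: 1.0 means every joule goes to IT."""
            return total_building_energy / it_energy

        print(pue(total_building_energy=1.12, it_energy=1.00))  # e.g. 1.12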

    We tested our model by deploying it on a live data centre. The graph below shows a typical day of testing, including when we turned the machine learning recommendations on and when we turned them off.

    deepmind.com

