Today's highlights: a spell checker built on word vectors, and a way to build a grammar correction corpus by noising and denoising natural language.

Site update: I uploaded a new layout for single posts. Headers are now displayed properly when you click into a post.

Spell Checking

Fantastic work from @fastdotai student @er214 showing how to create a spelling checker based on word vectors.
Includes link to full jupyter notebook code walk-thru. https://t.co/2F9wROkVrn pic.twitter.com/r8V7OdFQCP

— Jeremy Howard (@jeremyphoward) May 24, 2018

Want to sound like a pretentious bore?

Well now you can, thanks to this adroit scholarly monograph from the erudite @er214, introducing the "pretentiousness vector". https://t.co/y3GTEcHr84 pic.twitter.com/TgwutK1EHZ

— Jeremy Howard (@jeremyphoward) May 24, 2018
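To make the word-vector idea concrete, here is a minimal sketch of how I understand it: misspellings tend to sit at a roughly constant offset from their correct spellings in embedding space, so subtracting that offset and looking up the nearest neighbour gives a correction. This is not the notebook's code. The three-dimensional vectors, the word list, and the function names below are all made up for illustration; a real version would load pretrained embeddings.

```python
import math

# Toy 3-d "embeddings" (hypothetical values, chosen so that misspellings
# are shifted along the third axis relative to the correct spelling).
VECTORS = {
    "because":    [1.0, 0.0, 0.0],
    "becuase":    [1.1, 0.0, 1.0],
    "definitely": [0.0, 1.0, 0.0],
    "definately": [0.0, 1.1, 0.9],
    "receive":    [-1.0, 0.0, 0.0],
    "recieve":    [-0.9, 0.1, 1.1],
}

# Known (misspelled, correct) pairs used to estimate the offset.
TRAIN_PAIRS = [("becuase", "because"), ("definately", "definitely")]


def spelling_direction(pairs, vectors):
    """Average of (misspelled - correct) over the training pairs."""
    dim = len(next(iter(vectors.values())))
    total = [0.0] * dim
    for wrong, right in pairs:
        for k in range(dim):
            total[k] += vectors[wrong][k] - vectors[right][k]
    return [t / len(pairs) for t in total]


def correct(word, vectors, direction):
    """Subtract the spelling offset, then return the nearest other word."""
    shifted = [v - d for v, d in zip(vectors[word], direction)]
    candidates = (w for w in vectors if w != word)
    return min(candidates, key=lambda w: math.dist(shifted, vectors[w]))


direction = spelling_direction(TRAIN_PAIRS, VECTORS)
print(correct("recieve", VECTORS, direction))  # receive
```

Note that "recieve" is not in the training pairs, so the correction only works because the offset generalises across words in this toy data.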

Grammar Correction

How do you get a good corpus for grammar correction via translation? By Noising and Denoising Natural Language. @ziangx1e, Guillaume Genthial, @stan_xie, @AndrewYNg & @jurafsky #NLProc #NAACL2018 https://t.co/815ZNcpd6u pic.twitter.com/bp6HnsTsL5

— Stanford NLP Group (@stanfordnlp) May 24, 2018
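For intuition on what a "noised" corpus looks like, here is a deliberately naive sketch that builds (noisy, clean) sentence pairs by randomly dropping articles and swapping adjacent tokens. This is only a rule-based baseline I wrote for illustration, not the method from the paper. The function names, the article list, and the probabilities are all my assumptions.

```python
import random

ARTICLES = {"a", "an", "the"}


def noise(tokens, rng, p_drop=0.3, p_swap=0.1):
    """Return a corrupted copy of tokens: drop articles, swap neighbours."""
    # Drop each article with probability p_drop.
    noisy = [t for t in tokens
             if not (t.lower() in ARTICLES and rng.random() < p_drop)]
    # Swap adjacent tokens with probability p_swap, skipping past a swap
    # so the same token is not moved twice.
    i = 0
    while i < len(noisy) - 1:
        if rng.random() < p_swap:
            noisy[i], noisy[i + 1] = noisy[i + 1], noisy[i]
            i += 2
        else:
            i += 1
    return noisy


def make_pairs(sentences, seed=0, **kwargs):
    """Build (noisy, clean) pairs usable as source/target for a seq2seq model."""
    rng = random.Random(seed)
    return [(" ".join(noise(s.split(), rng, **kwargs)), s) for s in sentences]


# With p_drop=1.0 and p_swap=0.0 every article is removed and nothing is swapped.
pairs = make_pairs(["She bought a book at the store"], p_drop=1.0, p_swap=0.0)
print(pairs[0][0])  # She bought book at store
```

A correction model would then be trained to translate the noisy side back into the clean side.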

Overfit with One Parameter

Tired of over-fitting your data with neural network models that have millions of parameters? Now you can over-fit with just one parameter: newly accepted paper derives a simple, one-parameter equation that can fit any scatter plot to a given precision https://t.co/TI2BRHP0mI

— steve piantadosi (@spiantado) May 23, 2018
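The trick that makes this possible is that a single real number with unlimited precision can store an arbitrary amount of data. The sketch below is not the paper's equation. It is a simpler bit-packing illustration of the same point: quantise each y value to a fixed number of bits, pack them all into the binary expansion of one parameter theta, and recover the x-th value by shifting. The names `encode`, `decode`, and `BITS` are mine, and I assume y values lie in [0, 1) and x values are the indices 0, 1, 2, and so on.

```python
from fractions import Fraction

BITS = 8  # precision per data point (assumption: 8 bits is enough)


def encode(ys, bits=BITS):
    """Pack y values in [0, 1) into the binary digits of one rational theta."""
    theta = Fraction(0)
    for i, y in enumerate(ys):
        q = min(int(y * 2 ** bits), 2 ** bits - 1)  # quantise to `bits` bits
        theta += Fraction(q, 2 ** (bits * (i + 1)))
    return theta


def decode(theta, x, bits=BITS):
    """Read the x-th value back by shifting theta left and keeping `bits` bits."""
    shifted = theta * 2 ** (bits * (x + 1))
    q = int(shifted) % 2 ** bits
    return q / 2 ** bits


ys = [0.25, 0.5, 0.9]
theta = encode(ys)  # the single "parameter"
print([decode(theta, x) for x in range(len(ys))])
```

Each decoded value is within 2^-BITS of the original, and exact rational arithmetic is what keeps the example honest: with ordinary floats the later points would be lost to rounding.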

Causal Inference and do-Calculus

Judea Pearl claims all we do in ML is curve fitting. I wrote this post to explain that claim and introduce the basics of causal inference to ML folks.

Machine Learning beyond Curve Fitting: An Intro to Causal Inference and do-Calculus https://t.co/1osm0VcaaR

— Ferenc Huszár (@fhuszar) May 24, 2018

Until now, most of the datasets only consists of X, y variables (like MNIST).
We ought to start creating (X,z,y) datasets in order to apply Do-Calculus or further extend to casual inference. (Collecting all z is impossible in general, so under the certain restricted setting of z) https://t.co/R7bFQASiCE

— Daniel Jiwoong Im (@Daniel_J_Im) May 24, 2018
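To see why having z in the dataset matters, here is a small worked example of the backdoor adjustment, p(y | do(x)) = sum over z of p(y | x, z) p(z), compared with the plain conditional p(y | x). It is a sketch under strong assumptions: all variables are binary, z is the only confounder (so it satisfies the backdoor criterion), and the probability tables are invented numbers.

```python
# Hypothetical probability tables for binary z (confounder), x, and y.
P_Z1 = 0.5                                   # p(z = 1)
P_X1_GIVEN_Z = {0: 0.2, 1: 0.8}              # p(x = 1 | z)
P_Y1_GIVEN_XZ = {(0, 0): 0.1, (1, 0): 0.3,   # p(y = 1 | x, z), keyed by (x, z)
                 (0, 1): 0.5, (1, 1): 0.7}


def p_z(z):
    return P_Z1 if z == 1 else 1 - P_Z1


def p_x_given_z(x, z):
    return P_X1_GIVEN_Z[z] if x == 1 else 1 - P_X1_GIVEN_Z[z]


def p_y1_do_x(x):
    """Interventional: average p(y=1 | x, z) over the marginal p(z)."""
    return sum(P_Y1_GIVEN_XZ[(x, z)] * p_z(z) for z in (0, 1))


def p_y1_given_x(x):
    """Observational: average p(y=1 | x, z) over p(z | x) instead."""
    num = sum(P_Y1_GIVEN_XZ[(x, z)] * p_x_given_z(x, z) * p_z(z) for z in (0, 1))
    den = sum(p_x_given_z(x, z) * p_z(z) for z in (0, 1))
    return num / den


print(round(p_y1_do_x(1), 2))     # 0.5
print(round(p_y1_given_x(1), 2))  # 0.62
```

The two numbers differ because z influences both x and y. With only (X, y) pairs the 0.62 is all you can estimate, while the 0.5 needs z.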

TensorFlow Eager Execution

⚡️ “Tensorflow eager execution in 12 tweets” https://t.co/WAGCKTr9yg

— Martin Görner (@martin_gorner) May 23, 2018

Measuring Schedule Strength in Sports

Slides from my talk at the Fields Sports Analytics workshop on measuring schedule strength in sports https://t.co/wUYtAHFufk pic.twitter.com/ecJwxNglVk

— Michael Lopez (@StatsbyLopez) May 24, 2018

MURA Dataset

MURA (musculoskeletal radiographs) is a large dataset of bone X-rays.

Can your AI model detect abnormalities in bone X-rays as well as a radiologist?

My @Stanford lab just released a new dataset, MURA. Join our deep learning competition to see how your model compares: https://t.co/sWSklQ9ykU @pranavrajpurkar @jeremy_irvin16 @mattlungrenMD

— Andrew Ng (@AndrewYNg) May 24, 2018

Translate

Translate is an #opensource project for developing machine translation models that can be trained in @PyTorch and exported to #Caffe2 for production using ONNX. https://t.co/gqsPCUCru5

— Deep Learning London (@deeplearningldn) May 24, 2018

Translation is an amazing triumph of machine learning and language research. It made the world smaller, enabling commerce and improving communication, essential for peace. Kudos to @facebook for opening this tech so others in Africa etc can tune it to their needs. Bravo @ylecun https://t.co/2DQaocvYOf

— Nando de Freitas (@NandoDF) May 24, 2018

Pandas 0.23

Warning: I ran into an incompatibility issue when using it with an older Python 3.5 release.

Pandas 0.23 - a major release from 0.22.0 and includes a number of API changes, deprecations, new features, enhancements, and performance improvements along with a large number of bug fixes. https://t.co/T0Tftgd9X2

— Python Software (@ThePSF) May 24, 2018

Notables

Suggestive Drawing Among Human and AI: An essay exploring interaction of ML tools with creative design process. As an experiment, they incorporated pix2pix models trained on different data domains into canvas drawing tool.
web: https://t.co/iW8ZzlJfgY
pdf: https://t.co/wqbQYWLc1V pic.twitter.com/j0KwNySQpI

— hardmaru (@hardmaru) May 24, 2018

Nice paper from (@ukhndlwl, He He, Peng Qi and @jurafsky) on exploring what kind of info English LSTM Language Models pay attention to. https://t.co/rJD3FnrWNV

— (((λ()(λ() 'yoav)))) (@yoavgo) May 24, 2018

"Pushing the bounds of dropout" (preprint): <https://t.co/LxgP9749u2>. See dropout in a new light (as a family of models), turned inside out (choose a model at evaluation) and upside down (deterministic dropout is the best modulo unexpected regularization).

— Gábor Melis (@GaborMelis) May 24, 2018

Announcing the @Netflix Research Website #AI #DataScience #MachineLearning https://t.co/HP1pPos4Nj pic.twitter.com/C1L0h1qlps

— KDnuggets (@kdnuggets) May 24, 2018

There is no better way to finish off some Bayesian optimisation than with a bit of local optimisation: congratulations to Mark McLeod on his #icml2018 paper! Featuring a novel stopping criterion and IMHO a killer acronym. https://t.co/SSV6gbXuU4 pic.twitter.com/fCEAzzOpcK

— Michael A Osborne (@maosbot) May 23, 2018

Our new preprint on prediction versus inference:

Excellent p-values do not guarantee successful out-of-sample predictions https://t.co/Xb4cfzqZ4o @dngman @bttyeo @DaniSBassett @KriegeskorteLab @Montreal_AI @shakir_za #MachineLearning #rstats pic.twitter.com/Cn0Y1U7DiB

— Danilo Bzdok (@danilobzdok) May 22, 2018

Miscellaneous

Facebook is on a streak of 67 acquisitions unchallenged by antitrust authorities, which sounds impressive until you compare with Amazon at 91, and Google at 214.

— Tim Wu (@superwuster) May 24, 2018

Technical Empathy - the ability to see the system from the point of view of the caller of your code, not just the point of view of your code

— Michael Feathers (@mfeathers) January 26, 2015

This happens to apply to AI too: there are those who are interested in AI as a means to understand how intelligence works -- find out the nature of what we are -- and there are those who seek to create AI because they see it as a powerful, world-changing technology

— François Chollet (@fchollet) May 24, 2018

Facebook launched a database of political ads. It's incomplete. For example, it didn’t catch, uh, this one from @Mccallforall:
https://t.co/3hTAHc1E6l

— ProPublica (@ProPublica) May 25, 2018

@ceshine_en

Inspired by @WTFJHT