Today's highlights: a spell checker built on word vectors, and a method that noises and denoises natural language to build a corpus for grammar correction.
Site update: uploaded a layout for single posts. Clicking into a post now displays its headers properly.
Spell Checking
Fantastic work from @fastdotai student @er214 showing how to create a spelling checker based on word vectors.
— Jeremy Howard (@jeremyphoward) May 24, 2018
Includes link to full jupyter notebook code walk-thru. https://t.co/2F9wROkVrn pic.twitter.com/r8V7OdFQCP
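To make the word-vector idea concrete, here is a minimal sketch, not the code from the linked notebook. It assumes that misspellings sit at a roughly constant offset from their correct forms in embedding space, so that subtracting an averaged "spelling direction" moves a misspelled word near its correction. The vectors in `TOY_VECTORS` and the helper names (`spelling_direction`, `correct`) are invented for illustration; real embeddings such as GloVe have hundreds of dimensions.

```python
import math

# Toy 3-d "embeddings" invented for illustration only; real word vectors
# come from a trained model and are much higher-dimensional.
TOY_VECTORS = {
    "because":    [1.0, 0.0, 0.0],
    "becuase":    [1.0, 0.0, 1.0],
    "definitely": [0.0, 1.0, 0.0],
    "definately": [0.0, 1.0, 1.1],
    "separate":   [0.7, 0.7, 0.0],
    "seperate":   [0.7, 0.7, 0.9],
}


def subtract(a, b):
    """Element-wise a - b."""
    return [x - y for x, y in zip(a, b)]


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def spelling_direction(pairs, vectors):
    """Average of (misspelled - correct) over known (bad, good) pairs."""
    diffs = [subtract(vectors[bad], vectors[good]) for bad, good in pairs]
    return [sum(column) / len(diffs) for column in zip(*diffs)]


def correct(word, direction, vectors):
    """Shift `word` against the spelling direction; return the nearest other word."""
    shifted = subtract(vectors[word], direction)
    candidates = (w for w in vectors if w != word)
    return max(candidates, key=lambda w: cosine(shifted, vectors[w]))


# Estimate the direction from two known pairs, then apply it to an unseen one.
direction = spelling_direction(
    [("becuase", "because"), ("definately", "definitely")], TOY_VECTORS
)
suggestion = correct("seperate", direction, TOY_VECTORS)
print(suggestion)
```

With real embeddings the vocabulary is large, so the nearest-neighbour search would typically be restricted to known correctly spelled words rather than every entry.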
Want to sound like a pretentious bore?
— Jeremy Howard (@jeremyphoward) May 24, 2018
Well now you can, thanks to this adroit scholarly monograph from the erudite @er214, introducing the "pretentiousness vector". https://t.co/y3GTEcHr84 pic.twitter.com/TgwutK1EHZ
Grammar Correction
How do you get a good corpus for grammar correction via translation? By Noising and Denoising Natural Language. @ziangx1e, Guillaume Genthial, @stan_xie, @AndrewYNg & @jurafsky #NLProc #NAACL2018 https://t.co/815ZNcpd6u pic.twitter.com/bp6HnsTsL5
— Stanford NLP Group (@stanfordnlp) May 24, 2018
Overfit with One Parameter
Tired of over-fitting your data with neural network models that have millions of parameters? Now you can over-fit with just one parameter: newly accepted paper derives a simple, one-parameter equation that can fit any scatter plot to a given precision https://t.co/TI2BRHP0mI
— steve piantadosi (@spiantado) May 23, 2018
Causal Inference and do-Calculus
Judea Pearl claims all we do in ML is curve fitting. I wrote this post to explain that claim and introduce the basics of causal inference to ML folks.
— Ferenc Huszár (@fhuszar) May 24, 2018
Machine Learning beyond Curve Fitting: An Intro to Causal Inference and do-Calculus https://t.co/1osm0VcaaR
Until now, most of the datasets only consists of X, y variables (like MNIST).
— Daniel Jiwoong Im (@Daniel_J_Im) May 24, 2018
We ought to start creating (X,z,y) datasets in order to apply Do-Calculus or further extend to casual inference. (Collecting all z is impossible in general, so under the certain restricted setting of z) https://t.co/R7bFQASiCE
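As a brief illustration of why the extra variable z matters, the standard backdoor adjustment formula can be compared with the ordinary conditional. This is a textbook identity rather than anything taken from the linked post, and it assumes z is a set of observed variables satisfying the backdoor criterion relative to x and y (it blocks every backdoor path from x to y and contains no descendants of x).

```latex
% Observational conditional: z is weighted by p(z | x)
p(y \mid x) = \sum_{z} p(y \mid x, z)\, p(z \mid x)

% Interventional distribution via backdoor adjustment: z is weighted by p(z)
p(y \mid \mathrm{do}(x)) = \sum_{z} p(y \mid x, z)\, p(z)
```

The two expressions differ only in how z is weighted, which is why a dataset containing only (X, y) cannot in general separate them.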
Tensorflow Eager Execution
⚡️ “Tensorflow eager execution in 12 tweets” https://t.co/WAGCKTr9yg
— Martin Görner (@martin_gorner) May 23, 2018
Measuring Schedule Strength in Sports
Slides from my talk at the Fields Sports Analytics workshop on measuring schedule strength in sports https://t.co/wUYtAHFufk pic.twitter.com/ecJwxNglVk
— Michael Lopez (@StatsbyLopez) May 24, 2018
MURA dataset
MURA (musculoskeletal radiographs) is a large dataset of bone X-rays.
Can your AI model detect abnormalities in bone X-rays as well as a radiologist?
— Andrew Ng (@AndrewYNg) May 24, 2018
My @Stanford lab just released a new dataset, MURA. Join our deep learning competition to see how your model compares: https://t.co/sWSklQ9ykU @pranavrajpurkar @jeremy_irvin16 @mattlungrenMD
Translate
Translate is an #opensource project for developing machine translation models that can be trained in @PyTorch and exported to #Caffe2 for production using ONNX. https://t.co/gqsPCUCru5
— Deep Learning London (@deeplearningldn) May 24, 2018
Translation is an amazing triumph of machine learning and language research. It made the world smaller, enabling commerce and improving communication, essential for peace. Kudos to @facebook for opening this tech so others in Africa etc can tune it to their needs. Bravo @ylecun https://t.co/2DQaocvYOf
— Nando de Freitas (@NandoDF) May 24, 2018
Pandas 0.23
Warning: I ran into an incompatibility when using it with an older Python 3.5 release.
Pandas 0.23 - a major release from 0.22.0 and includes a number of API changes, deprecations, new features, enhancements, and performance improvements along with a large number of bug fixes. https://t.co/T0Tftgd9X2
— Python Software (@ThePSF) May 24, 2018
Notables
Suggestive Drawing Among Human and AI: An essay exploring interaction of ML tools with creative design process. As an experiment, they incorporated pix2pix models trained on different data domains into canvas drawing tool.
— hardmaru (@hardmaru) May 24, 2018
web: https://t.co/iW8ZzlJfgY
pdf: https://t.co/wqbQYWLc1V pic.twitter.com/j0KwNySQpI
Nice paper from (@ukhndlwl , He He, Peng Qi and @jurafsky) on exploring what kind of info English LSTM Language Models pay attention to. https://t.co/rJD3FnrWNV
— (((λ()(λ() 'yoav)))) (@yoavgo) May 24, 2018
"Pushing the bounds of dropout" (preprint): <https://t.co/LxgP9749u2>. See dropout in a new light (as a family of models), turned inside out (choose a model at evaluation) and upside down (deterministic dropout is the best modulo unexpected regularization).
— Gábor Melis (@GaborMelis) May 24, 2018
Announcing the @Netflix Research Website #AI #DataScience #MachineLearning https://t.co/HP1pPos4Nj pic.twitter.com/C1L0h1qlps
— KDnuggets (@kdnuggets) May 24, 2018
There is no better way to finish off some Bayesian optimisation than with a bit of local optimisation: congratulations to Mark McLeod on his #icml2018 paper! Featuring a novel stopping criterion and IMHO a killer acronym. https://t.co/SSV6gbXuU4 pic.twitter.com/fCEAzzOpcK
— Michael A Osborne (@maosbot) May 23, 2018
Our new preprint on prediction versus inference:
— Danilo Bzdok (@danilobzdok) May 22, 2018
Excellent p-values do not guarantee successful out-of-sample predictions https://t.co/Xb4cfzqZ4o @dngman @bttyeo @DaniSBassett @KriegeskorteLab @Montreal_AI @shakir_za #MachineLearning #rstats pic.twitter.com/Cn0Y1U7DiB
Miscellaneous
Facebook is on a streak of 67 acquisitions unchallenged by antitrust authorities, which sounds impressive until you compare with Amazon at 91, and Google at 214.
— Tim Wu (@superwuster) May 24, 2018
Technical Empathy - the ability to see the system from the point of view of the caller of your code, not just the point of view of your code
— Michael Feathers (@mfeathers) January 26, 2015
This happens to apply to AI too: there are those who are interested in AI as a means to understand how intelligence works -- find out the nature of what we are -- and there are those who seek to create AI because they see it as a powerful, world-changing technology
— François Chollet (@fchollet) May 24, 2018
Facebook launched a database of political ads. It's incomplete. For example, it didn’t catch, uh, this one from @Mccallforall:
— ProPublica (@ProPublica) May 25, 2018
https://t.co/3hTAHc1E6l