Happy GDPR day everyone! May all your data science be compliant.
— Michael Dewar (@mikedewar) May 25, 2018
Gym Retro
Releasing Gym Retro — 1000+ games for reinforcement learning research, plus an integrator tool to add your own classic games. https://t.co/V5unhRknY4 pic.twitter.com/kRbtvsOubi
— OpenAI (@OpenAI) May 25, 2018
OpenAI released Gym Retro. Supporting potentially over 1000 environments, will be useful for studying generalization. Should be more stable compared to Universe ... https://t.co/oOIdjvDVij pic.twitter.com/2U5O6xMjSL
— hardmaru (@hardmaru) May 25, 2018
Keras MXNet Backend
“Keras gets a speedy new backend with keras-mxnet” by Aaron Markham https://t.co/zRZmxTKHGn
— SandeepKrishnamurthy (@skm4ml) May 23, 2018
Bernoulli data
Lots of people see a picture like this, which shows how the standard deviations of Bernoulli data go down as the probabilities head towards 0 or 1, and think "inference will be easier for those cases." pic.twitter.com/3WqDMhE1kZ
— John Myles White (@johnmyleswhite) May 25, 2018
But if you're using the Central Limit Theorem to work with this kind of data, this other plot of the skewness of Bernoulli data over the same range should remind you that even while one aspect gets better, another aspect gets worse. pic.twitter.com/uebABhYxmu
— John Myles White (@johnmyleswhite) May 25, 2018
also the c.v. is higher near zero and one, so need more samples to get same % accuracy in parameter estimate.
— Cian O'Donnell (@cian_neuro) May 25, 2018
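The interplay the thread describes follows directly from the closed-form moments of a Bernoulli(p) variable: the standard deviation √(p(1−p)) shrinks toward 0 and 1, while the skewness (1−2p)/√(p(1−p)) and the coefficient of variation √((1−p)/p) blow up near 0. A minimal sketch (the helper name `bernoulli_stats` is just for illustration):

```python
import math

def bernoulli_stats(p):
    """Return (std dev, skewness, coefficient of variation) of Bernoulli(p)."""
    sd = math.sqrt(p * (1 - p))          # shrinks as p -> 0 or 1
    skew = (1 - 2 * p) / sd              # magnitude grows as p -> 0 or 1
    cv = sd / p                          # = sqrt((1-p)/p), grows as p -> 0
    return sd, skew, cv

for p in (0.5, 0.1, 0.01):
    sd, skew, cv = bernoulli_stats(p)
    print(f"p={p}: sd={sd:.3f}, skew={skew:.2f}, cv={cv:.2f}")
```

At p = 0.5 the distribution is symmetric (skewness 0) with the largest standard deviation; by p = 0.01 the standard deviation has dropped below 0.1, but skewness and CV are both near 10, which is exactly why CLT-based inference and fixed-relative-accuracy estimation get harder even as the raw variance falls.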
Price of Lego Toys
This week's #KernelAwards winner uses a generalized boosted regression model to predict the price of Lego toys: https://t.co/XAJ1VfO2ls pic.twitter.com/td52H01JVS
— Kaggle (@kaggle) May 25, 2018
Calling out Andrew Ng
(Part of a long thread. Please click on the tweet to read the full thread.)
And how can the public trust scientists if time and time again they are presented with hype instead of science? /5
— Lior Pachter (@lpachter) May 24, 2018
Notables
As a non-native English speaker, I'm always interested in studies that seek to better understand L2 acquisition. This paper does a great job. It shows that cognates (words with the same etymological origin) influence the English of non-native speakers. https://t.co/WebVtLFoWj
— Sebastian Ruder (@seb_ruder) May 25, 2018
This is a super useful paper that we need more of: Better ImageNet models are not necessarily better feature extractors (ResNet is best); but for fine-tuning, ImageNet performance is strongly correlated with downstream performance. https://t.co/MrkX4yYgHn
— Sebastian Ruder (@seb_ruder) May 25, 2018
AutoAugment: Learning Augmentation Policies from Data. They apply automated search to find data augmentation strategies that perform well on the validation set, and achieve SOTA test errors of 1.48% on CIFAR10, 10.69% on CIFAR100, 16.46% on ImageNet Top-1. https://t.co/2zRPSY2fdz pic.twitter.com/es5yLtRrvl
— hardmaru (@hardmaru) May 25, 2018
My paper on stochastic phenomena now out in @Ecology_Letters! You can also explore the code appendix in an interactive @RStudio session using @mybinderteam. Check out the binder in the compendium: https://t.co/mMLOpeSFxs https://t.co/6MqgVEciY3
— Carl Boettiger (@cboettig) May 24, 2018
Miscellaneous
50+ Useful #MachineLearning & Prediction APIs, 2018 Edition https://t.co/m7nI39OGj3 pic.twitter.com/zMSPeLmM2P
— KDnuggets (@kdnuggets) May 25, 2018
ImageAI - A python library built to empower developers to build applications and systems with self-contained Computer Vision capabilities https://t.co/nYwdiFFrXW
— Python Trending (@pythontrending) May 25, 2018
Things to keep in mind when reading research papers:
-papers are biased towards using complex models
-papers from well-funded labs are biased towards using the biggest datasets on the biggest machines
-papers from high-profile orgs don't have inherently better ideas (but more PR)
— Thomas Wolf (@Thom_Wolf) May 25, 2018
Really interesting post on open source development regarding great contributions from volunteers vs sustained, focused attention: "many problems where forty people each putting in one hour a week are helpless, but that can easily be solved by one person working forty hours." https://t.co/NeaTHTKb8D
— Sebastian Raschka (@rasbt) May 26, 2018
“Unobtrusive features inside train stations are designed to unconsciously manipulate passenger behavior, via light, sound, and other means. Japan’s boundless creativity in this realm reflects the deep consideration given to public transportation.” https://t.co/DT7Fe1Poa3
— hardmaru (@hardmaru) May 25, 2018