How big data and data analytics are making sports more intelligent

Posted On 1:14:00 PM

How big data is making sports more intelligent

From trainers and athletes to businesses, big data analytics can make a difference in improving efficiency, accuracy and profitability in sports

Advanced technologies have influenced every aspect of our lives, and sport is no exception to this.

More than 11,000 athletes from 207 National Olympic Committees participated in the world's most popular sports competition, the Olympics, held in Rio this year.

To win as many medals as possible, participating countries implemented a range of strategies based on analysis of available historical data and forward-looking performance predictions.

Advances in the capture, storage and analysis of data are revolutionising every aspect of the game and are poised to deliver a major breakthrough in sports.

Data patterns allow teams to perform better and alter their in-game decision-making based on what they are seeing. Here are a few ways big data is making sports more intelligent:

Recommend winning strategy

Coaches and players are leveraging big data analytics to better understand how their team performs against its opponents.

They use advanced devices like Google Glass, GPS trackers, RFID tags and sensors to track every aspect of the live game, including player movement, distances and speeds, delivering data to an intelligent analytical system in fractions of a second.

Analytical engines then process the data using algorithms and can recommend a strategy to win the game.

Wearable gadgets & sensors

Wearable technologies play a significant role in all Olympic sports. To keep players at the peak of physical fitness, teams analyse data received from sophisticated sensors and wearable gadgets.

Data received from devices such as heart rate monitors and accelerometers can influence athletes' tactics and positioning in real time and potentially prevent injuries.

Provide real-time data to sports channels

Big data also plays a significant role for sports production companies, providing commentators with relevant real-time data, replays and game facts that help spectators engage with the event, in this case the Rio Olympics.

The goal is to leverage data from live games for the future. There is no doubt that data and analytics have had an outsized impact on sports. Since their introduction to the Olympic Games, we have seen increasing medal counts for smaller countries and a closing of the gap between competing nations and continents.

The Rio Olympics will undoubtedly raise the bar in terms of how NOCs use data to decide their competitive strategies, thereby maximising ROI.
Courtesy Absolute Data Analytics

Hack This: How to Consult Google's Machine Learning Oracle

Posted On 5:36:00 PM

Machine learning and its artificial intelligence parent are probably most often regarded by regular-ass people as kind of opaque and esoteric subjects. Or even just tech buzzwords, which is a shame because it doesn't have to be like that. These things are just tools and as tools they can be employed for extremely complex, inscrutable-seeming tasks found in fields like neuroprosthetics or machine perception, or they can be used for everyday things like classifying spam.

In other words, machine learning doesn't have to be brain surgery, though it can be useful for that. At the same time, getting into something like Google's TensorFlow open-source machine learning library is pretty daunting. Fortunately, within its folio of cloud services Google offers an extremely accessible machine learning platform known as the Prediction API. It's been called a machine learning "black box" because it hides many of the inner workings of its algorithms, offering instead a clean and very simple interface for creating machine learning models from training data and then using those models to make predictions from new data.

You don't even really need to know any code to get going with Prediction, but being able to use the API programmatically greatly increases its power. So, in the guide below, I'm going to explain how to use Prediction mostly just by using Google's browser-based API explorer, but where appropriate, I'll tell you where using code would be useful and where you would start with that.

0.0) WHAT IT MEANS TO PREDICT

Machine learning, in the crudest high-level sense, is taking data and then using that data to create mathematical models of some phenomenon—and then using those models to say useful things about new data. The more data we can feed to a model, the more we can "train" it, the less fuzzy its predictions become.

If I have two data points for a spam classification algorithm, one not-spam email and one spam email, my model isn't going to have very much to say. It will basically be making blind guesses. But with 10 million emails, it's going to start to figure out what is special about the spam emails to make them spam, i.e. which features in a spam email are important in determining its spamminess. The model will eventually be able to classify spam with very little error, basically none. Machine learning depends on quantity and quality of data.

1.0) WHAT CAN WE PREDICT?

We can ask Prediction to predict two very general things:

1.1) CLASSIFICATION:

We can ask Google Cloud "what is this?" We give Google some choices and we tell Google about some observations that have been made about those choices. Then, Google takes all of that and makes a model. We can then give Google some new observations and ask it what those observations are of. Google will return its best guess, and tell us how sure it is of that guess.

So, imagine some data like this:

butterfly, wings, 1 inch, yellow, orange
bird, wings, 5 inches, blue
plane, wings, 300 feet, silver
butterfly, wings, 1.5 inches, orange, red
dog, tail, 24 inches, brown

I want to use that data to be able to predict whether some new animal/thing is a butterfly, bird, plane, or dog.

So, I ask it to do that. And to ask the model a question, I need to provide it with some observations about the thing I'm asking about. With the features below, I'll ask Google: what is this?

wings, 3 inches, black, brown

And it will make a prediction. But it won't be very good because we haven't provided very much data.
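
To make the idea concrete, here is a toy sketch in Python. It is not what Google does inside Prediction (that part is the black box). It is a plain nearest-neighbor lookup, and the distance function, its equal weighting of the three features, and the function names are all assumptions made up for illustration.

```python
# Toy nearest-neighbor classifier over the rows above.
# The distance function and its weighting are assumptions for illustration only.
import math

# (label, appendage, length in inches, colors)
TRAINING = [
    ("butterfly", "wings", 1.0, {"yellow", "orange"}),
    ("bird", "wings", 5.0, {"blue"}),
    ("plane", "wings", 300 * 12.0, {"silver"}),  # 300 feet expressed in inches
    ("butterfly", "wings", 1.5, {"orange", "red"}),
    ("dog", "tail", 24.0, {"brown"}),
]


def distance(a, b):
    """Smaller means more alike. Compares appendage, size, and colors."""
    appendage_a, length_a, colors_a = a
    appendage_b, length_b, colors_b = b
    d = 0.0 if appendage_a == appendage_b else 1.0
    # Compare sizes on a log scale so the plane doesn't swamp everything.
    d += abs(math.log10(length_a) - math.log10(length_b))
    # Jaccard distance between the two color sets.
    union = colors_a | colors_b
    d += 1.0 - len(colors_a & colors_b) / len(union)
    return d


def classify(features):
    """Return the label of the closest training row and its distance."""
    best = min(TRAINING, key=lambda row: distance(row[1:], features))
    return best[0], distance(best[1:], features)


label, dist = classify(("wings", 3.0, {"black", "brown"}))
print(label, round(dist, 3))
```

With only five rows, the answer hinges almost entirely on size, which is exactly the "not very good" prediction described above. More rows would let the colors and appendages start to matter.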

1.2) REGRESSION:

Google can also give us numbers. This is a different sort of model—a regression model. Say that we take the bank balances of a variety of different people, and we know three things about those people: occupation, gender, age. We fill out a spreadsheet where the data looks something like this (but with a lot more entries):

$2100, student, male, 28
$10,000, lawyer, male, 55
$7005, engineer, female, 33

We feed that into the API and Google will make a model that will predict the balance of someone new, with these properties:

bartender, female, 40

And Google will spit back an actual number. A new number, not a classification. Not a choice among options. That's huge.
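
As a rough illustration of what "spitting back a number" means, here is a minimal least-squares sketch in Python. It is a deliberate simplification and not Google's method: it fits a straight line to age alone and ignores occupation and gender, and the function name is made up for this example.

```python
# Toy least-squares fit of balance against age only.
# Ignoring occupation and gender is a simplification made for this sketch.
rows = [
    (2100.0, "student", "male", 28),
    (10000.0, "lawyer", "male", 55),
    (7005.0, "engineer", "female", 33),
]

ages = [age for _, _, _, age in rows]
balances = [balance for balance, _, _, _ in rows]

mean_age = sum(ages) / len(ages)
mean_balance = sum(balances) / len(balances)

# Ordinary least squares for a single feature: slope = Sxy / Sxx.
sxy = sum((a - mean_age) * (b - mean_balance) for a, b in zip(ages, balances))
sxx = sum((a - mean_age) ** 2 for a in ages)
slope = sxy / sxx
intercept = mean_balance - slope * mean_age


def predict_balance(age):
    """Predict a bank balance from age using the fitted line."""
    return intercept + slope * age


predicted = predict_balance(40)  # the 40-year-old bartender
print(f"${predicted:,.2f}")
```

The result is a brand-new number that appears nowhere in the training rows, which is the difference between regression and picking one label from a list.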

2.0) GOOGLE APIS

Prediction is one of many APIs Google offers as part of its cloud platform. These are all basically gateways or interfaces that we can use to access different services, such as Google Maps, Google Translate, or YouTube. We'd normally think of accessing YouTube via, well, YouTube, but there is also a YouTube API where we can access YouTube videos, comments, analytics, and the rest of it as data in a sort of raw form. You can even imagine the actual YouTube site as being only one possible implementation of that data among many. That's a pretty good way of thinking about APIs, generally: an underlying interface offering some useful service that can be implemented in any number of different ways.

To use the Prediction API, you first need to register a Google Cloud Platform project here. Then, you need to enable billing for the project. Using Google Prediction is free until a certain threshold of use is met, and then it's not. There's pretty much no way you're going to hit that threshold here, but Google still needs you to enable billing.

OK, next you need to enable the Prediction and the Google Cloud Storage APIs on the project you just created. Do that here.

2.1) UPLOADING DATA

Assuming you're square with Google per the above instructions, we can actually get to the machine learning. We'll need some data in this general format.

item, feature 1, feature 2, feature 3, feature 4 ...

The "item" is the thing that the machine learning model is actually learning about. In this row of data, it's learning that some entity existed that had these four characteristics, or features. Given a lot of rows with a lot of features, it will get better and better at saying what sort of entities new collections of features correspond to. You could also think of the "item" here as a label that we assign to a certain collection of features.

We need to actually find some data now. There's a load of sample datasets out there, many of which are archived at the University of California, Irvine's Machine Learning Repository. Note that they're not all already in the right format. In many cases, the label is at the end of the row, not the beginning. This isn't too hard to fix programmatically but it's a bit beyond the scope of the current Hack This.
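
For the curious, moving a label from the end of each row to the front can be sketched in a few lines of Python. This assumes the file is already comma-separated with the label in the last column. The function name and the sample rows are made up for illustration.

```python
import csv
import io


def label_first(text):
    """Move the last column of every CSV row to the front."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in csv.reader(io.StringIO(text)):
        if row:  # skip blank lines
            writer.writerow([row[-1]] + row[:-1])
    return out.getvalue()


# Hypothetical sample rows with the label at the end.
sample = "5.1,3.5,1.4,0.2,setosa\n7.0,3.2,4.7,1.4,versicolor\n"
print(label_first(sample), end="")
```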

I did find one that's just about perfect. It has to do with fuel consumption given different sorts of automobile features. A tiny sampling looks like this:

18.0 8 307.0 130.0 3504. 12.0 70 1 "chevrolet chevelle malibu"
15.0 8 350.0 165.0 3693. 11.5 70 1 "buick skylark 320"
18.0 8 318.0 150.0 3436. 11.0 70 1 "plymouth satellite"

The first column is the actual miles per gallon, while the following columns correspond to cylinders, displacement, horsepower, weight, acceleration, model year, origin, and model name, in that order.

Given the 398 entries in the dataset, we should be able to predict the fuel efficiency of a new car based on some or all of these features. You can look at it here and then you should download it as a text file (.txt).

First, open the file you just downloaded up in a text editor. It doesn't matter which one really, just so long as it has a find-and-replace function. You'll see a file full of neat columns that are separated by spaces and tabs rather than commas. We need commas, alas. I managed to fix this in about a minute just by finding various quantities of spaces and then replacing them with a comma. You'll need to find and replace a tab for the last one. A big part of using datasets winds up being cleaning and formatting them, but this was pretty easy.
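
If you'd rather not do the find-and-replace by hand, the same cleanup can be sketched in Python. This assumes every row looks like the sample above: numbers separated by spaces, followed by a tab and a model name in double quotes. The function names are made up for this example.

```python
import csv
import io


def split_row(line):
    """Split one whitespace-separated row, keeping the quoted name intact."""
    numbers, _, name = line.partition('"')
    fields = numbers.split()
    if name:
        fields.append(name.strip().rstrip('"'))
    return fields


def to_csv(text):
    """Convert the whole file's text to comma-separated rows."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for line in text.splitlines():
        if line.strip():  # skip blank lines
            writer.writerow(split_row(line))
    return out.getvalue()


sample = (
    '18.0 8 307.0 130.0 3504. 12.0 70 1\t"chevrolet chevelle malibu"\n'
    '15.0 8 350.0 165.0 3693. 11.5 70 1\t"buick skylark 320"\n'
)
print(to_csv(sample), end="")
```

Splitting off the quoted name first matters because the model names contain spaces of their own, and those should not turn into commas.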

Now that we have the dataset on our computer, we need to upload it to Google. This is why we authorized Google Cloud Storage above. First, head over to the Cloud Storage browser, which you'll find here. Once you're at the Storage page, you're going to create a "bucket," which is pretty much what it sounds like: a virtual receptacle that you can chuck files of almost any type into. And you have to give your bucket a name, which has to be unique across the whole of Google Cloud, so you might have to get creative. I managed to snag "fuel-efficiency," sorry.

Once you have a bucket, go ahead and upload your .txt file to that bucket. Now you have a unique location within cloud storage that you can refer back to. It's referenced as "bucketname/filename." Simple enough.

3.0) TRAINING DAY

Next, we get to actually make a machine learning model. For machine learning laypeople like us, this is what's cool about the Prediction API—we can almost completely outsource the actual guts and gears of machine learning to Google.

To actually access the API itself, we're going to use Google's API Explorer. This is a browser-based interface that we can use for interacting with APIs without writing actual code. All we have to do is fill out some stuff in a form and the Explorer will put it together into a proper API request and send it without us having to really deal with anything. This is handy, but it's also pretty limiting.

To get there, navigate from the Google cloud console (the general interface within which you've been doing all of this stuff) to the API manager and then click on the Prediction API within the list. You'll get a page that looks like this one:

Click on the "Try this" link and you'll be directed to the API Explorer.

You'll next see a list of services. Pick the "insert" one, which will direct you to a page that looks like this:

Give it the name of your project (which you created in the beginning), and then click in the "request body" field. It'll give you a dropdown. We need to make up an id for the model we're about to create and then we need to tell it where our data is. For me, it looks like this:

Click execute, and you should get a "200" reply, indicating that the request didn't have any errors. It will also give you a URL for your new model. This is the "selfLink."
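
For reference, the request body the Explorer assembles is just a small JSON object holding a model id and a data location. A rough sketch of building one in Python follows. The field names `id` and `storageDataLocation` should be treated as assumptions to check against the Prediction API reference, and the model id and file name below are made up. Only the bucket name comes from the steps above.

```python
import json


def insert_body(model_id, bucket, filename):
    """Build a request body naming the new model and its training data."""
    return {
        "id": model_id,
        # Cloud Storage location in "bucketname/filename" form.
        "storageDataLocation": f"{bucket}/{filename}",
    }


# "fuel-model" and "auto-mpg.txt" are hypothetical names for illustration.
body = insert_body("fuel-model", "fuel-efficiency", "auto-mpg.txt")
print(json.dumps(body, indent=2))
```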

4.0) PREDICT!

And now the moment we've all been waiting for. To make a prediction, we're going to use the same API Explorer functionality. Head back to the page listing all of the Prediction API services, and now instead of picking insert, pick predict.

So, go ahead and give it your project name again and then the model ID you created in the last step. From the dropdown in the request body field, pick input and then, from the new dropdown, pick csvInstance. Maybe you can guess what it wants: comma-separated values. These values describe some new car that we want Google to predict the fuel efficiency of. I'm going to do this for my own vehicle because it's probably easier than trying to make some data up.

This is what I'm feeding it:

Here's what Google predicted:

"outputValue": "19.181282"

Which is a bit low, but I also fudged my figures a bit.
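
Note that the prediction comes back as a quoted string, not a number. If you were handling the reply in code, a minimal sketch of pulling the value out might look like this. The reply here is trimmed down to the one field shown above, and the variable names are made up.

```python
import json

# A reply reduced to the single field shown above.
reply = json.loads('{"outputValue": "19.181282"}')

# The value is a string, so convert it before doing any arithmetic.
predicted_mpg = float(reply["outputValue"])
print(round(predicted_mpg, 1))
```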

5.0) USING CODE

Using the Prediction API via code rather than the API Explorer is a pretty simple matter. In Python, making a prediction based on an existing model (what we just did) would look like this (the actual data is from some other project, so don't worry about it):

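# `api` is assumed to be an authorized Prediction API client, for example one
# built with googleapiclient.discovery.build('prediction', 'v1.6', http=authorized_http).
data = '11.1,1.0,2.0,19.1,98,4,2,2.5,37,2.0,4.0,1.0,2.0,670'
# The API reference describes csvInstance as a list of feature values, so split the string.
features = [float(value) for value in data.split(',')]
prediction = api.trainedmodels().predict(
    project='your project id here',
    id='your model id here',
    body={'input': {'csvInstance': features}},
).execute()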

Easy enough, right? The tricky part actually has to do with user authentication, which is necessary because using this API could potentially cost someone money if some usage limits (well beyond what we've just done) were hit. When money is involved with Google's cloud services, you have to use authentication. This is easy in the API Explorer, but doing it in code is something I have a hard enough time explaining to myself, let alone a bunch of strangers.

6.0) THE FUTURE

I kind of think of the Prediction API as inspiration to go forth and really learn the nuts and bolts of machine learning—or just to think of cool machine learning ideas—but this could have all kinds of out-of-the-box applications for anything from weird art projects to analyzing website traffic. I'm using it to analyze data from sound files recorded of different background environments. Eventually, I want a tool that can take ambient sound and make predictions about where it's from. Prediction makes this easy.

As a final note, some of this can be tricky and you might break things once or twice. Maybe you try to give it a file with the wrong formatting or holes where some data should be. In dealing with huge datasets, this is potentially a huge chore. In a lot of cases, Excel or Google Sheets can help with this part, but expect some trial and error, generally. Predicting the future is worth it.

Courtesy of Motherboard.Vice: http://motherboard.vice.com/read/hack-this-how-to-consult-googles-machine-learning-oracle-2

Be Warned! The Seven Definite Ways Facebook’s Big Data AI Algorithm Change Will Affect Marketers And Publishers

Posted On 11:30:00 AM

The social network is a fact of life for any company that creates content. And the new reality will require lots of adjustment.

The day many marketers and publishers have dreaded has arrived: Facebook is changing its algorithm to send less traffic to content sites.

In a blog post this morning, the social giant announced it will increasingly prioritize posts shared by friends and family over those from publishers, brands, and other pages.

"The growth and competition in the publisher ecosystem is really, really strong," Adam Mosseri, Facebook’s vice president of product management, told the New York Times. "We’re worried that a lot of people using Facebook are not able to connect to friends and family as well because of that."

The move doesn’t come as a total shock. Research by SocialFlow earlier this month found that the reach of publisher stories had already dropped by 42%.

Facebook doesn’t announce its algorithm changes unless they’re going to have a big impact, and this one will. After all, 40% of publisher traffic comes from Facebook, according to recent research from analytics company Parse.ly.

I spend my life obsessing over Facebook’s impact on marketing and media. As this news broke this morning, seven big thoughts came to mind:

1. SOME ARE SAYING THIS WON’T AFFECT PUBLISHERS THAT MUCH. IT WILL.

As you may have seen on Twitter, there’s an easy way to downplay the algorithm change. If the posts of "friends and family" will get top ranking in the News Feed, then publishers just need to get those folks to share their stories.

But if you’re a publisher with a massive Facebook presence like BuzzFeed or Vox, a big reason people share your stories on Facebook in the first place is because they see them in their News Feed after you post. In other words, publishers’ Facebook posts are the seed that grows into a giant tree of traffic. (BuzzFeed’s Pound technology does a great job of showing how this works.) Fewer seeds means much, much less traffic.

2. PUBLISHERS WILL HAVE TO FOLLOW MARKETERS AND PAY FOR TRAFFIC.

When Facebook slashed brand reach on Facebook three years ago, marketers went through the seven stages of algorithm grief:

Shock
Denial
Anger
Bargaining
Depression
Testing
Buying a ton of Facebook ads

As publishers stare at their declining reach, Facebook’s dashboard will offer a helpful suggestion: "Pay $100 to reach 18,000–24,000 people with this post." Over time, they’ll probably have to bite the bullet and open their wallets.

3. AD COSTS ARE GOING TO GO UP.

The cost of promoting content on Facebook has steadily risen as more people spend money to promote ads on the social network.

As publishers reach that seventh stage of grief, they’re going to buy more Facebook ads to promote posts. That increase in demand will likely raise supply-side costs. So expect the cost per engagement of your Facebook ads to go up.

4. THE INSTANT ARTICLES EXPLOSION WILL SLOW.

One of Facebook’s fastest growing features over the past year has been Instant Articles, the product that lets brands and publishers post articles directly to Facebook. They’ve been a hit because they generally get prioritized higher in the News Feed than links that drive back to publisher sites, and they command a similar CPM.

Instant Articles will likely take the same hit as all other publisher posts, but my bet is that this algorithm change will make a lot of publishers more skeptical about giving up control of their content directly to Facebook. Don’t be surprised if a significant number of publishers pull back their Instant Article usage and refocus on driving readers to their owned sites.

5. EVERYONE MIGHT WANT TO SLOW THEIR ROLL ON VIDEO TOO.

Most publishers have been reorganizing their strategies to focus on video. Mashable, for instance, laid off about 30 staffers this spring in a "strategic shift toward video." A big reason that publishers have been so eager to jump on the video bandwagon is the massive view numbers they get from autoplay video in the Facebook feed (even if those view counts were ridiculously inflated). Marketers have been jumping on the video bandwagon as well.

In many ways, Facebook created the illusion that its feed provided an infinite amount of attention for video. We were trapped in a fantasy of Mark Zuckerberg as Oprah, screaming, "You get a view! And you get a view! And you get a view!" But now the holiday special might be over, meaning that everyone needs to think hard about how they get real value out of their video investments.

6. THIS COULD HURT NATIVE ADVERTISING.

Last week, I wrote about Facebook’s new rule, which requires publishers to tag a brand in their Facebook posts when they share branded content, like the Onion did here when sharing a native ad it made for Firehouse Subs:

7. THE FACEBOOK ECHO CHAMBER WILL ONLY GROW.

Chalk up a win for the Facebook echo chamber. Facebook wants to show us more content from our like-minded friends and family, which, as researchers have shown, makes us more narrow-minded.

But publishers offer—at least sometimes—a reprieve from that echo chamber. Sure, the New York Times and CNN might lean left like I do, but they’re a lot more balanced than my Sarah Lawrence classmates, who are still sharing crackpot theories about how Bernie can win the election. This change could further polarize societies that can’t afford to become much more polarized.

This brings up the big question we should all probably ask once we stop freaking out about how we’re going to hit our July traffic goals. Facebook is a media force unlike anything we’ve ever seen. It controls not just the viability of individual media and marketing businesses, but also the primary way many people across the world get their news and stay informed. It has incredible power, whether it wants it or not.

Today, Facebook told us that it wants to keep things all in the family. But what if it’s just creating warring tribes?

The Future Is Already Here, Artificial Intelligence is Reshaping Life On Earth: 101 Examples

Posted On 1:00:00 PM

Artificial Intelligence (AI) will have more influence on the lives and livelihoods of young people over the next 20 years than any other factor.

This month I’ve been tracking news headlines to get a sense for how widespread AI applications have become. With a couple of news alerts and searches I spotted 101 current applications. No sci-fi here: these are tools people are using today. And these aren’t just clever algorithms; they get better and smarter the more data they interact with.

Life & Media

1. Mapping apps and satellite view (Google Earth)

2. Speech recognition (TechCrunch)

3. Dating apps (Vancouver Sun)

4. Language translation (Silicon Valley Business Journal)

5. Image recognition (Fei-Fei Li on TED)

6. AI composition and music recommendations (Newsweek)

7. Make reservations at a restaurant (TechCrunch)

8. Filter content and make recommendations (Hubspot podcast)

9. Write articles (Recode)

10. Optimize website-building (TechCrunch)

11. Manage prayer requests (Deseret News)

12. Track and store the movements of automobiles, trains, planes and mobile phones (IBM)

13. Track real-time sentiments of billions of people through social media (IBM)

14. Beat the best humans in chess and Go (CCTV)

15. Coaching social-emotional relationships (Phys)

Safety & security

16. Security-driven AI systems can easily distinguish bad behaviors from good behaviors (Economic Times)

17. Quickly find security vulnerabilities (Defense One)

18. The criminal justice system is increasingly using algorithms to predict a defendant’s future criminality (ProPublica)

19. Anomaly detection using machine vision (IBM)

20. Predictive models for crime (IBM)

21. Autonomous aerial and undersea warfighting (Nextgov)

22. Guide cruise missiles (Express)

Industry & Agriculture

23. Optimize crop yield (Amr Awadallah in Forbes)

24. LettuceBot reduces chemical use by 90% (Wired)

25. Driverless tractors (Business Wire)

26. Manufacturers predict which machines will break down (Amr Awadallah in Forbes)

27. Smart robots for repetitive jobs, from apple picking to sneaker making (TechCrunch)

28. Robots develop and share new skills (MIT)

Transportation

29. Driverless cars (Nature) coming to Pittsburgh this fall (Wired)

30. Driverless trucks (TechCrunch)

31. Managing drone traffic (Yahoo)

32. Making bus routes smarter (Shanghai Daily)

33. Improve the efficiency of public transportation systems (IBM)

34. Oil exploration efficiency (Oil Price)

Environment

35. Prediction and management of pollutants and carbon footprints. (IBM)

36. Make data centers, power plants, and energy grids more efficient (Money)

Medicine & Health

37. Digitized health records and all medical knowledge to improve diagnosis (IBM)

38. Performing cohort analysis, identifying micro-segments of similar patients, evaluating standard-of-care practices and available treatment options, ranking by relevance, risk and preference, and ultimately recommending the most effective treatments for their patients (IBM)

39. Power precision medicine (NIH)

40. Reduction of medical errors (H&HN)

41. Study the genetics of Autism (SAC)

42. Analyze genomic sequences to develop therapies (Amr Awadallah in Forbes)

43. Read x-rays better than a radiologist (nanalyze)

44. Genomic editing — which may be the most important (and scariest) item on the list (Time)

45. Use social media to diagnose depression and mental illness (Tech Times)

Organizational Management

46. Speed up and improve identity verification and background checking (PE Hub)

47. Monitor employee satisfaction and predict staff turnover (HuffPost)

48. Sort through stacks of résumés from job seekers (ProPublica)

49. Replace handcrafted rule-based systems (TechRepublic)

50. Smart virtual assistants (Slate)

51. Enterprise tech companies provide deep learning as a service — AI on demand (TweakTown)

52. Humans and robots will increasingly collaborate on problem solving (Quartz)

53. Automated floor cleaning (Slate)

Art & Architecture

54. Organic algorithms in architecture (Greg Lynn on TED)

55. Virtual reality art (Wired)

56. Synthesized music (Newsweek)

Social Services & Infrastructure

57. Timely and relevant answers to citizens (IBM)

58. Predict the needs of individuals and population groups, and develop plans for efficient deployment of resources. (IBM)

59. Prediction of demand, supply, and use of infrastructure (IBM)

60. Mobile phone network services (RCRwireless)

61. Analysis of lead contamination in Flint water (Talking Machines)

62. Improve building and city design (Property Report)

63. Poverty map of Africa to improve services delivery (Yahoo)

Finance & Banking

64. Fraud detection (Business Insider)

65. Scan news, spot trends and adjust portfolios (Hubspot podcast)

66. Determine credit scores and qualify applicants (ProPublica)

67. Find the best insurance coverage at the right cost (IBM)

68. Deliver personalized service with reduced error rates (Finextra)

69. Handle 30k banking customer service transactions per month (American Banker)

70. Answer 100M financial questions involving complex data (Fast Company)

71. Auto-adjudication of insurance claims (Fast Company)

72. Tax preparation (CFO)

Marketing & Customer Service

73. Power chatbot customer service (CB Insights)

74. Product recommendations (Digital Marketing Blog)

75. Manage White House comments (TechCrunch)

76. Chatbot lawyer contests parking tickets (Guardian)

77. Robot inventory checker (NY Times)

78. Recognise customer behavior and provide predictive customer service (Brand Equity)

79. Predict eBay sales (HeatST)

80. Improve sales funnel conversion (Digital Marketing Blog)

Entertainment

81. Fantasy football picks (Fake Teams)

82. Writing screenplays (Entertainment)

Education

83. Recommend next best learning experience (Forbes)

84. Personalized learning programs (IBM)

85. Intelligent tutoring (Forbes)

86. Provide six-trait writing feedback (Hubspot podcast)

87. Digitize the world’s literature, enabling search and analysis (IBM)

88. Embedded adaptive assessments promote competency-based learning (Google’s Jonathan Rochelle in Business Insider)

89. Improve career education (Google’s Jonathan Rochelle in Business Insider)

90. AI is boosting HigherEd persistence with text nudges (Rose Luckin in Times HigherEd)

91. Process intelligence tools identify and visualize opportunities (MIT)

92. Matching teachers and schools (Getting Smart)

93. Bus scheduling (Getting Smart)

Smart Home

94. Smart home control systems (Fortune)

Check out these smart home startups. A lot of these use AI behind the scenes to get smarter over time.

Courtesy Tom Van Der Ark