Modern image recognition technology is getting very good at identifying objects. But engineers at MIT CSAIL have shown how simply altering an object's textures can fool the AI into thinking the object is something completely different from what it actually is.
Intel buys chip maker Movidius to help bring computer vision to drones
Intel’s RealSense computer vision platform has been lacking a low-powered way of recognizing what its depth-sensing cameras are seeing — until now. The chip giant is buying Movidius, the designer of a range of system-on-chip products for accelerating computer vision processing.
Movidius supplies chips to drone makers such as DJI and to thermal imaging company FLIR Systems, itself a supplier of DJI. Its chips help computers figure out what they are seeing through cameras like Intel’s RealSense by breaking down the processing into a set of smaller tasks that they can execute in parallel.
There are systems that already do this using GPUs, but those are relatively power-hungry, often consuming tens of watts. That’s not a problem in fixed applications with access to mains electricity, or in cars, which have huge batteries and a way to recharge them. But in drones or other lightweight IoT devices, power consumption needs to be much lower. Movidius aims for a design power of around one watt with its Myriad 2 vision processing units.
Having largely failed to get its Atom processors into smartphones, Intel is looking for ways to lever them into other devices, such as drones.
Josh Walden, senior vice president and general manager of Intel’s New Technology Group, sees potential for Movidius to help it create systems for drones, and also for augmented, virtual and merged reality devices, robots and security cameras, he said in a post to the company’s blog. It’s not just about the chips, he said: Intel is also buying algorithms developed by Movidius for deep learning, depth processing, navigation and mapping, and natural interactions.
RUSSELL KIRSCH SAYS he’s sorry.
More than 50 years ago, Kirsch took a picture of his infant son and scanned it into a computer. It was the first digital image: a grainy, black-and-white baby picture that literally changed the way we view the world. With it, the smoothness of images captured on film was shattered to bits.
The square pixel became the norm, thanks in part to Kirsch, and the world got a little bit rougher around the edges.
As a scientist at the National Bureau of Standards in the 1950s, Kirsch worked with the only programmable computer in the United States. “The only thing that constrained us was what we imagined,” he says. “So there were a lot of things we thought of doing. One of which was, what would happen if computers could see the world the way we see it?”
Kirsch and his colleagues couldn’t possibly know the answer to that question. Their work laid the foundations for satellite imagery, CT scans, virtual reality and Facebook.
Kirsch made that first digital image using an apparatus that transformed his picture into the binary language of computers, a regular grid of zeros and ones. A mere 176 by 176 pixels, that first image was built from roughly one one-thousandth the information in pictures captured with today’s digital cameras. Back then, the computer’s memory capacity limited the image’s size. But today, bits have become so cheap that a person can walk around with thousands of digital baby photos stored on a pocket-sized device that also makes phone calls, browses the Internet and even takes photos.
Yet science is still grappling with the limits set by the square pixel.
“Squares was the logical thing to do,” Kirsch says. “Of course, the logical thing was not the only possibility … but we used squares. It was something very foolish that everyone in the world has been suffering from ever since.”
Now retired and living in Portland, Oregon, Kirsch recently set out to make amends. Inspired by the mosaic builders of antiquity who constructed scenes of stunning detail with bits of tile, Kirsch has written a program that turns the chunky, clunky squares of a digital image into a smoother picture made of variably shaped pixels.
He applied the program to a more recent picture of his son, now 53 years old, which appears with Kirsch’s analysis in the May/June issue of the Journal of Research of the National Institute of Standards and Technology.
“Finally,” he says, “at my advanced age of 81, I decided that instead of just complaining about what I did, I ought to do something about it.”
Kirsch’s method assesses a square-pixel picture with masks that are 6 by 6 pixels each and looks for the best way to divide this larger pixel cleanly into two areas of the greatest contrast. The program tries two different masks over each area — in one, a seam divides the mask into two rough triangles, and in the other a seam creates two rough rectangles. Each mask is then rotated until the program finds the configuration that splits the 6-by-6 area into sections that contrast the most. Then, similar pixels on either side of the seam are fused.
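The procedure described above can be sketched in a few lines. This is a rough reconstruction from the article's description, not Kirsch's actual program: the exact seam set (horizontal, vertical, and diagonal splits) and the mean-difference contrast measure are assumptions.

```python
import numpy as np

def best_split(block):
    """Find the straight seam through a 6x6 block that maximizes
    contrast between the two regions it creates (a rough sketch of
    Kirsch's mask idea; the seam set here is an assumption)."""
    h, w = block.shape
    best_contrast, best_mask = 0.0, None
    masks = []
    # "Rough rectangles": horizontal and vertical seams at each offset.
    for i in range(1, h):
        m = np.zeros((h, w), bool); m[:i, :] = True; masks.append(m)
        m = np.zeros((h, w), bool); m[:, :i] = True; masks.append(m)
    # "Rough triangles": diagonal seams, plus their flipped rotations.
    for k in range(-h + 2, h - 1):
        m = np.tri(h, w, k, dtype=bool)
        masks.append(m)
        masks.append(np.flipud(m))
    for m in masks:
        a, b = block[m], block[~m]
        if a.size and b.size:
            contrast = abs(a.mean() - b.mean())
            if contrast > best_contrast:
                best_contrast, best_mask = contrast, m
    return best_mask

def fuse(block):
    """Replace each side of the best seam with its mean value,
    turning the 6x6 square into two variably shaped 'pixels'."""
    m = best_split(block)
    out = block.astype(float).copy()
    out[m], out[~m] = block[m].mean(), block[~m].mean()
    return out
```

Run over every 6-by-6 tile of an image, this replaces blocky squares with pairs of regions whose shapes follow the local edge.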
Kirsch has also used the program to clean up an MRI scan of his head. The program may find a home in the medical community, he says, where it’s standard to feed images such as X-rays into a computer.
Kirsch’s approach addresses a conundrum that the field of computational photography continues to grapple with, says David Brady, head of Duke University’s imaging and spectroscopy program in Durham, N.C.
Images built from pixels can show an incredible amount of detail, Brady says. “It’s fun to talk to kids about this because they don’t know what I’m talking about anymore, but the snow on analog television — a block-based imager can reconstruct that pattern exactly.”
But images taken from real life never look like that, Brady says. Typically, they have several large uniform sections — forehead, red shirt, blue tie. This means there’s a high probability that one pixel in an image will look the same as the pixel next to it. There’s no need to send all those look-alike pixels as single pieces of information; the information that’s really important is where things are different.
“I always joke that it’s like Los Angeles weather,” Brady says. “If you were a weatherman in Los Angeles you would almost always be right if you say tomorrow is going to be the same weather as today. So one thing you can do is say, I’m going to assume the next pixel is like this one. Don’t talk to me, don’t tell me anything about the image, until you get something different. A good weatherman in Los Angeles tells you when a big storm is coming. In an image, that’s an edge. You want to assume smoothness but have a measurement system that’s capable of accurately finding where the edges are.”
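Brady's weatherman analogy is essentially run-length (predictive) coding: say nothing while pixels repeat, and spend information only at the edges. A minimal illustration of the idea (not code from the article):

```python
def encode(row, tol=0):
    """Run-length encode a row of pixel values: store each value once
    along with how long it repeats, so only 'edges' (changes) cost
    information. `tol` lets nearly identical pixels count as the same."""
    runs = []
    for v in row:
        if runs and abs(v - runs[-1][0]) <= tol:
            runs[-1][1] += 1      # same as yesterday's weather: just extend the run
        else:
            runs.append([v, 1])   # a storm (an edge): record the new value
    return runs

def decode(runs):
    """Rebuild the row exactly from the (value, count) pairs."""
    return [v for v, n in runs for _ in range(n)]
```

A row of five identical forehead pixels followed by three red-shirt pixels costs two entries instead of eight.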
Where Kirsch uses masks to accomplish that task, researchers today typically use equations far more complex than his to strike the balance between shedding unnecessary information and keeping detail. Pixels are still the starting point of digital pictures today, but math — wavelet theory in particular — is what converts the pixels into the picture. Wavelet theory takes a small number of measurements and turns them into the best representation of what’s been measured. This best estimation of a picture allows a megapixel image to be stored as mere kilobytes of data.
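The wavelet idea is easiest to see with the simplest wavelet, the Haar transform: pairwise averages capture the smooth regions, pairwise differences capture the edges, and small differences can be discarded almost for free. A toy one-dimensional sketch; real codecs use far more elaborate wavelets and multiple transform levels:

```python
def haar_step(signal):
    """One level of the Haar transform: pairwise averages (the smooth
    part) followed by pairwise differences (the detail/edges)."""
    avgs = [(a + b) / 2 for a, b in zip(signal[::2], signal[1::2])]
    diffs = [(a - b) / 2 for a, b in zip(signal[::2], signal[1::2])]
    return avgs + diffs

def inverse_haar_step(coeffs):
    """Rebuild the signal from averages and differences."""
    n = len(coeffs) // 2
    avgs, diffs = coeffs[:n], coeffs[n:]
    out = []
    for a, d in zip(avgs, diffs):
        out += [a + d, a - d]
    return out

def compress(signal, threshold):
    """Zero out small detail coefficients: smooth regions cost almost
    nothing to store, while real edges keep their large coefficients."""
    coeffs = haar_step(signal)
    n = len(coeffs) // 2
    return coeffs[:n] + [d if abs(d) >= threshold else 0 for d in coeffs[n:]]
```

Storing mostly zeros is what lets a megapixel image shrink to kilobytes: the zeroed coefficients compress away, and decoding yields a close estimate of the original.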
Images: 1) This baby picture, scanned in 1957, was the first digital image. At 176 by 176 pixels, its size was limited by the memory capacity of the computer./NIST. 2) Before transforming the square-pixel image, a close-up of one ear appears as a blocky stack. The variably shaped pixel treatment turns it back into an ear./NIST.
SQUARE PIXEL INVENTOR TRIES TO SMOOTH THINGS OUT
This is an excerpt from Tommy Walker‘s article on ConversionXL
When it comes to online imagery, it’s not so much about having images as it is about making sure those images give the visitor a sense of texture, size, scale, detail, context & brand. According to MDG Advertising, 67% of online shoppers rated high-quality images as being “very important” to their purchase decision, slightly ahead of “product specific information”, “long descriptions”, and “reviews & ratings”:
It’s All About the Images
Joann Peck & Suzanne B. Shu of UCLA published a study called “The Effect of Mere Touch on Perceived Ownership” which found that when the imagery of an object was vivid and detailed, viewers’ perceived ownership of the product increased.
Moreover, psychologists Kirsten Ruys & Diederik Stapel of the Tilburg Institute for Behavioral Economics Research found that imagery can affect a person’s mood even when they’re unaware it is happening. In their research, they flashed images across a screen in a manner that made it impossible for participants to be fully conscious of what they were seeing. Participants were then tested on cognition, feelings & behavior, and in the end their general mood reflected the images they had been subconsciously exposed to.
So Why The Hell Do You INSIST On Using Stock Photography?!
Alright, look… I get it. You’re on a budget. You need an image that represents “freedom” or “happiness” or ::shudder:: “corporate synergy”.
You’ve diplomatically explained to the client that they really should be using custom photography, but they insist you find a “better/cheaper representation online.” You’ve also gotten the uneasy vibe that they’ll invoke the “the customer is always right / I can take my business elsewhere” conversation if you push too hard.
So you go to iStockphoto or Shutterstock, run a query, and try to find the best representation of whatever vague concept you’ve been given as a part of the brief. You pay, download the stock photo, jury-rig it into your design & look at your work with a mixed sense of pride & shame. But the client LOVES it! (“See, looks like Stock wasn’t so bad after all, was it Mr. Designer?”)
Here’s the problem:
TinEye: 2,931 results
Every other poor schmuck in every other vertical has used the Exact. Same. Photograph. And if you’re really unfortunate, one of those other schmucks is also your competitor.
Meet The Everywhere Girl
Back in 1996, Jennifer Anderson posed for a stock photo shoot shortly after graduating college. At the time, companies would subscribe to a service & receive their stock photos on a CD-ROM. Trouble was, the companies receiving the CDs didn’t have an easy way to verify who else was using a photo, and the license for the images was not exclusive – meaning anyone could use them. Within a few years, Jennifer became the face of college girls in what seemed to be every marketing campaign. The most notorious faux pas came in 2004, when PC competitors Dell & Gateway used photos from the same photo shoot in their “Back to School” promotional material.
But did it stop there? Nope. Other companies that ended up using photos from Jenn’s stock shoot include:
- H&R Block
- AAA Auto Insurance
- A series of books about Christianity
- A teen chat line
- A car stereo store
- An actuary website
Jenn’s image became so common online that communities sprang up dedicated to reporting sightings of this stock photo model around the web.
Why You Have To Be Careful With How You Use Stock Photos
While Jenn’s story is comical in its own right, there are some pretty serious consequences for brands inadvertently using the same stock photo to represent the same concept.
Looking at you, Customer Service girl.
The main problem is what’s called the Picture Superiority Effect, where “concepts are much more likely to be remembered experimentally if they are presented as pictures rather than as words.”
According to Wikipedia, this has to do with Allan Paivio’s “dual-coding theory,” which states that mental associations become stronger when information is presented both visually & verbally (or through text): “Visual and verbal information are processed differently and along distinct channels in the human mind, creating separate representations for information processed in each channel. The mental codes corresponding to these representations are used to organize incoming information that can be acted upon, stored, and retrieved for subsequent use.”

This applies to both positive & negative experiences. Considering that nearly 2 million Americans fall victim to online scams a year, and many scam sites lean heavily on low-priced stock photography… the odds are not in your favor.

We already know from the “The Science of Storytelling & Its Effect on Memory” article that when a visitor lands on your site for the first time, everything they see is processed through their working memory – the hyper-short-term memory that pulls information from long-term memory to make judgements on what it sees within milliseconds.
If the stock photo you’re using is at all similar to another website that created a negative experience for the visitor, subconsciously, they’re projecting their negative experiences onto your stock photograph, reducing trust & adding friction to the process.
This is likely the real reason why, when Marketing Experiments tested a real photo of their client against their top-performing stock photo, they found visitors were nearly 35% more likely to sign up when they saw the real deal.

Taken to an extreme, using the wrong stock photography could also result in a form of “mistaken identity.” Though this article isn’t specific to using stock photography, the story of Arizona Discount Movers perfectly illustrates what can happen when the good guys get penalized for something the bad guys did.
Stock photos in & of themselves can be a useful, quick & effective way to communicate your point, but you should probably follow a few steps to make sure you’re getting the most out of stock photography.
Step 1 – See Who Else Is Using That Stock Photo
This is where a tool called TinEye comes in very handy: run a “reverse image search” to see where else that photo has been used. If you get something like “168 results”, take the time to investigate who else has used that image, and how they’ve used it. If they cater to a similar market and/or have a huge reach, find a different stock photo. The last thing you want is to try to be unique by using a photo everyone’s already seen. For added peace of mind, go to Google Images and drag the photo into the search bar. Google will pull up all of the exact instances of that photo, so you can see if there’s anything TinEye missed.
Google Image Search results.

Step 2 (Optional) – Check To See If You Can Get A “Rights Managed” License

If the image in question hasn’t been used by everyone in the known world, check to see if you can keep it that way. A rights-managed license gives you exclusive use of that image within the markets you specify for a specified time frame.
Rights Managed Time-Frame
Even though these licenses are more expensive, they’re huge insurance against anyone else using your image, preventing an “Everywhere Girl” scenario of your own. To read the rest of the original story on ConversionXL, click here: http://conversionxl.com/stock-photography-vs-real-photos-cant-use/
If you must use stock photography, make sure it’s on brand, not grossly overused & do what you can to make it your own. Basic and advanced photo-manipulation tactics can transform stock photos into completely unique pieces; they just take a little more time to create. But also, don’t be afraid to take your own photos either.
It’s amazing how much quality is packed into smartphones and other less expensive camera options. With a little planning & some basic knowledge on how lighting & composition work, you can take unique, high quality photographs that better represent your brand.
The Watermark Project from George Prest on Vimeo.
“How do you change perception of a billion dollar company? Not with advertising but by changing the very interface that made them less than popular in the first place. By changing their product.
This is the first work that R/GA London has done for one of its newest clients, Getty Images.
We’re dead proud of it.” -George Prest
The Watermark Project
Humans and software see some images differently, pointing out shortcomings of recent breakthroughs in machine learning.
By Caleb Garling on December 24, 2014 read the full original article here: TechnologyReview
WHY IT MATTERS
Image recognition algorithms are becoming widely used in many products and services.
Images like these were created to trick machine learning algorithms. The software sees each pattern as one of the digits 1 to 5.
A technique called deep learning has enabled Google and other companies to make breakthroughs in getting computers to understand the content of photos. Now researchers at Cornell University and the University of Wyoming have shown how to make images that fool such software into seeing things that aren’t there.
The researchers can create images that appear to a human as scrambled nonsense or simple geometric patterns, but are identified by the software as an everyday object such as a school bus. The trick images offer new insight into the differences between how real brains and the simple simulated neurons used in deep learning process images.
Researchers typically train deep learning software to recognize something of interest—say, a guitar—by showing it millions of pictures of guitars, each time telling the computer “This is a guitar.” After a while, the software can identify guitars in images it has never seen before, assigning its answer a confidence rating. It might give a guitar displayed alone on a white background a high confidence rating, and a guitar seen in the background of a grainy cluttered picture a lower confidence rating (see “10 Breakthrough Technologies 2013: Deep Learning”).
That approach has valuable applications such as facial recognition, or using software to process security or traffic camera footage, for example to measure traffic flows or spot suspicious activity.
But although the mathematical functions used to create an artificial neural network are understood individually, how they work together to decipher images is unknown. “We understand that they work, just not how they work,” says Jeff Clune, an assistant professor of computer science at the University of Wyoming. “They can learn to do things that we can’t even learn to do ourselves.”
These images look abstract to humans, but are seen by the image recognition algorithm they were designed to fool as the objects described in the labels.
To shed new light on how these networks operate, Clune’s group used a neural network called AlexNet that has achieved impressive results in image recognition. They operated it in reverse, asking a version of the software with no knowledge of guitars to create a picture of one, by generating random pixels across an image.
The researchers asked a second version of the network that had been trained to spot guitars to rate the images made by the first network. That confidence rating was used by the first network to refine its next attempt to create a guitar image. After thousands of rounds of this between the two pieces of software, the first network could make an image that the second network recognized as a guitar with 99 percent confidence.
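The back-and-forth between the two networks is, in effect, black-box hill climbing: mutate the image at random and keep any change that raises the classifier's confidence. A toy sketch of that loop, with a stand-in scoring function in place of the trained network (the scorer, image size, and mutation step are all assumptions, not details from the paper):

```python
import random

def fool(score, size=64, rounds=8000, rng=random.Random(0)):
    """Hill-climb an image of random pixels toward whatever the scorer
    rates highly, regardless of what a human would see in it.
    `score` stands in for the trained network's confidence output."""
    img = [rng.random() for _ in range(size)]
    best = score(img)
    for _ in range(rounds):
        candidate = img[:]
        i = rng.randrange(size)
        # Nudge one pixel, clipped to the valid [0, 1] range.
        candidate[i] = min(1.0, max(0.0, candidate[i] + rng.uniform(-0.2, 0.2)))
        s = score(candidate)
        if s > best:          # keep only mutations the "network" likes
            img, best = candidate, s
    return img, best

# Toy "classifier": confidence that the image is mostly bright.
brightness = lambda img: sum(img) / len(img)
```

After enough rounds the image scores near-perfect confidence, yet to a human it is just noise pushed toward brightness, which mirrors how the real trick images satisfy AlexNet's pixel statistics without containing a recognizable guitar.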
However, to a human, those “guitar” images looked like colored TV static or simple patterns. Clune says this shows that the software is not interested in piecing together structural details like strings or a fretboard, as a human trying to identify something might be. Instead, the software seems to be looking at specific distance or color relationships between pixels, or overall color and texture.
That offers new insight into how artificial neural networks really work, says Clune, although more research is needed.
Ryan Adams, an assistant computer science professor at Harvard, says the results aren’t completely surprising. The fact that large areas of the trick images look like seas of static probably stems from the way networks are fed training images. The object of interest is usually only a small part of the photo, and the rest is unimportant.
Adams also points out that Clune’s research shows humans and artificial neural networks do have some things in common. Humans have been thinking they see everyday objects in random patterns—such as the stars—for millennia.
Clune says it would be possible to use his technique to fool image recognition algorithms when they are put to work in Web services and other products. However, it would be very difficult to pull off. For instance, Google has algorithms that filter out pornography from the results of its image search service. But to create images that would trick it, a prankster would need to know significant details about how Google’s software was designed.
IN DEPTH: Unlocking information from images
By Mary Branscombe, December 25th, on TechRadar
How machine learning and image recognition could revolutionise search
A machine learning system is capable of writing an image caption as well as a person
Text in documents is easy to search, but there’s a lot of information in other formats. Voice recognition turns audio – and video soundtracks – into text you can index and search. But what about the video itself, or other images?
Searching for images on the web would be a lot more accurate if instead of just looking for text on the page or in the caption that suggests a picture is relevant, the search engine could actually recognise what was in the picture. Thanks to machine learning techniques using neural networks and deep learning, that’s becoming more achievable.
When a team of Microsoft and Facebook researchers created a massive data dump of over 300,000 images with 2.5 million objects labelled by people (called Common Objects in Context), they said all those objects are things a four-year-old child could recognise. So a team of Microsoft researchers working on machine learning decided to see how well their systems could do with the same images – not just recognising them, but breaking them up into different objects, putting a name to each object and writing a caption to describe the whole image.
To measure the results, they asked one set of people to write their own captions and another set to compare the two and say which they preferred.
“That’s what the true measure of quality is,” explains distinguished scientist John Platt from Microsoft Research. “How good do people think these captions are? 23% of the time they thought ours were at least as good as what people wrote for the caption. That means a quarter of the time that machine has reached as good a level as the human.”
Part of the problem was the visual recogniser. Sometimes it would mistake a cat for a dog, or think that long hair was a cat, or decide that there was a football in a photograph of people gesticulating at a sculpture. This is just what a small team was able to build in four months over the summer, and it’s the first time they had labelled a set of images this large to train and test against.
“We can do a better job,” Platt says confidently.
Machine learning already does much better on simple images that only have one thing in the frame. “The systems are getting to be as good as an untrained human,” Platt claims. That’s testing against a set of pictures called ImageNet, which are labelled to show how they fit into 22,000 different categories.
“That includes some very fine distinctions an untrained human wouldn’t know,” he explains. “Like Pembroke Welsh corgis and Cardigan Welsh corgis – one of which has a longer tail. A person can look at a series of corgis and learn to tell the difference, but a priori they wouldn’t know. If there are objects you’re familiar with you can recognise them very easily but if I show you 22,000 strange objects you might get them all mixed up.” Humans are wrong about 5% of the time with the ImageNet tests and machine learning systems are down to about 6%.
That means machine learning systems could do better at recognising things like dog breeds or poisonous plants than ordinary people. Another recognition system called Project Adam, that MSR head Peter Lee showed off earlier this year, tries to do that from your phone.
Project Adam was looking at whether you can make image recognition faster by distributing the system across multiple computers rather than running it on a single fast computer (so it can run in the cloud and work with your phone). However, it was trained on images with just one thing in them.
“They ask ‘what object is in this image?'” explains Platt. “We broke the image into boxes and we were evaluating different sub-pieces of the image, detecting common words. What are the objects in the scene? Those are the nouns. What are they doing? Those are verbs like flying or looking.
“Then there are the relationships like next to and on top of, and the attributes of the objects, adjectives like red or purple or beautiful. The natural next step after whole image recognition is to put together multiple objects in a scene and try to come up with a coherent explanation. It’s very interesting that you can look in the image and detect verbs and adjectives.”
Making images useful
There are plenty of ways in which having your images automatically captioned and labelled will be useful, especially if you’re a keen photographer trying to stay on top of your image library or a news site looking for the right photograph.
“Indexing your photos by who’s in them is a very natural way to think about organising photos,” Platt points out. With more powerful labelling, you can search for objects in images (a picture of a cat) or actions (a picture of a cat drinking) or the relation between different objects in an image. “If I remember that I had a picture of a boy and a horse, I’d like to be able to index that – both the objects of the boy and the horse, and the relation between them – and put them in an index so I can go and search for them later.”
If you’re putting together a catalogue of products, having an automatically generated caption might be useful, but Platt doesn’t see much demand for something that specific. There is a lot of interest from different product teams at Microsoft, he says, but instead of creating captions for you he expects that “the pieces will be used in various products; behind the scenes, these bits will be running.”
Dealing with videos will mean making the recognition faster, and working out how to spot what’s interesting (because not every frame will be). But what’s important here is not just the speed, but the way the kind of understanding that underlies captioning complex images could transform search.
The deep learning neural networks and machine learning systems this image recognition uses are the same technologies that have revolutionised speech recognition and translation in the last few years (powering Microsoft’s upcoming Skype Translator). “Every time you talk to the Bing search engine on your phone you’re talking to a deep network,” says Platt. Microsoft’s video search system, MAVIS, uses a deep network.
The next step is to do more than recognise, and actually understand what things mean.
“Even for text there’s a fair amount of work and that’s where there’s a lot of interesting value, if we can truly understand text as opposed to just doing keyword search. Just doing keyword search gets you a long way, that’s how all of our search engines work today. But imagine if you had a system that could truly understand what your documents were about and truly be an assistant to you.”
The goal, he says, is to “try to truly understand the semantics of objects like video or speech or image or text, as opposed to the surface forms like just the words or just the colours.”
Excerpt by DOUGLAS MACMILLAN and ELIZABETH DWOSKIN
Most users of popular photo-sharing sites like Instagram, Flickr and Pinterest know that anyone can view their vacation pictures if shared publicly.
But they may be surprised to learn that a new crop of digital marketing companies are searching, scanning, storing and repurposing these images to draw insights for big-brand advertisers.
Some companies, such as Ditto Labs Inc., use software to scan photos—the image of someone holding a Coca-Cola can, for example—to identify logos, whether the person in the image is smiling, and the scene’s context. The data allow marketers to send targeted ads or conduct market research.
Others, such as Piqora Inc., store images for months on their own servers to show marketers what is trending in popularity. Some have run afoul of the loose rules on image-storing that the services have in place.
The startups’ efforts are raising fresh privacy concerns about how photo-sharing sites convey the collection of personal data to users. The trove is startling: Instagram says 20 billion photos have already been shared on its service, and users are adding about 60 million a day.
The digital marketers gain access to photos publicly shared on services like Instagram or Pinterest through software code called an application programming interface, or API. The photo-sharing services, in turn, hope the brands will eventually spend money to advertise on their sites.
Privacy watchdogs contend these sites aren’t clearly communicating to users that their images could be scanned in bulk or downloaded for marketing purposes. Many users may not intend to promote, say, a pair of jeans they are wearing in a photo or a bottle of beer on the table next to them, the privacy experts say.
A screenshot of the Ditto Labs site shows the fire hose of photos that it scans for brands. The site filters photos by categories such as beer. DITTO LABS
“This is an area that could be ripe for commercial exploitation and predatory marketing,” said Joni Lupovitz, vice president at children’s privacy advocacy group Common Sense Media. “Just because you happen to be in a certain place or captured an image, you might not understand that could be used to build a profile of you online.”
In recent years, startups have begun mining text in tweets or social-media posts for keywords that indicate trends or sentiment toward brands. The market for image-mining is newer and potentially more invasive because photos inspire more emotions in people and are sometimes open to more interpretation than text.
Instagram, Flickr and Pinterest Inc.—among the largest photo-sharing sites—say they adequately inform users that publicly posted content might be shared with partners and take action when their rules are violated by outside developers. Photos that are marked as private by users or not shared wouldn’t be available to marketers.
There are no laws forbidding publicly available photos from being analyzed in bulk, because the images were posted by the user for anyone to see and download. The U.S. Federal Trade Commission does require that websites be transparent about how they share user data with third parties, but that rule is open to interpretation, particularly as new business models arise. Authorities have charged companies that omit the scope of their data-sharing from privacy policies with misleading consumers.
The FTC declined to comment.
The photo sites’ privacy policies—the legal document enforced by law as promises to consumers—vary in wording but none of them clearly convey how third-party services treat user-posted photos.
While Facebook is one of the largest photo-sharing sites, the fact that most of its users restrict their photos’ access with privacy controls has deterred outside developers from mining those images. Developers commonly use Facebook’s API to pull in profile photos of its members but not for marketing purposes.
An Instagram spokesman said its partnerships with developers don’t “change anything about who owns photos, or the protections we have in place to keep our community a safe place.” Flickr said it takes steps to prevent outside developers from scanning photos on its site in bulk.
Pinterest said “our API only provides public information to a handful of partners intended to help their clients understand the performance of their content on Pinterest.”
Spokeswomen for Tumblr and Twitter declined to comment.
Jules Polonetsky, director of the Future of Privacy Forum, an advocacy group funded by Facebook and other tech companies, said users should assume that companies are scanning sites for market research if their photos are publicly viewable.
But the boom in image-scanning technologies could lead to a world in which people’s offline behavior, caught in unsuspecting images, increasingly becomes fodder for more personalized forms of marketing, said Peter Eckersley, technology-projects director for the Electronic Frontier Foundation.
Moreover, the use of software to scan faces or objects in photos is so new that most sites don’t mention the technology in their privacy policies.
Advertisers such as Kraft Foods Group Inc. pay Ditto Labs to find their products’ logos in photos on Tumblr and Instagram. The Cambridge, Mass., company’s software can detect patterns in consumer behavior, such as which kinds of beverages people like to drink with macaroni and cheese, and whether they are smiling in those images. Ditto Labs places users into categories, such as “sports fans” and “foodies,” based on the context of their images.
Kraft might use those insights to cross-promote certain products in stores or ads, or to better target customers online. David Rose, who founded Ditto Labs in 2012, said his image-recognition software will one day enable consumers to “shop” their friends’ selfies. Kraft didn’t respond to a request for comment.
Ditto Labs also offers advertisers a way to target specific users based on their photos posted on Twitter, though Mr. Rose said most advertisers are reluctant to do so because users might find it “creepy.”
Mr. Rose acknowledges that most people who upload photos don’t understand they could be scanned for marketing insights. He said photo-sharing services should do more to educate users and give them finer controls over how companies like his treat photos.
Beyond image recognition, some API partners employ a process called “caching,” meaning they download photos to their own servers. One of the more common uses of caching is to build a marketing campaign around photos uploaded by users and tagged with a specific hashtag.
The companies don’t mention caching in their privacy policies, and their rules vary in how long developers can store photos on their servers. Tumblr, for example, restricts caching to three days, while Instagram allows “reasonable periods.”
Some developers have already overstepped the rules set by photo-sharing sites. Last month, Pinterest learned from a Wall Street Journal inquiry that Piqora, one of seven partners in the business API program it launched in May, was violating its image-use policy.
Piqora, a San Mateo, Calif., marketing analytics startup, collects photos into a graphical dashboard that helps companies such as clothing and accessories maker Fossil Inc. track which of its own products, and those of competing brands, are most popular. This violated Pinterest’s rules, which restrict partners from using images from the site posted by anyone other than their own clients.
After Pinterest learned about the violation, the company asked Piqora to discontinue the practice and plans to begin performing regular audits of its business partners, a spokesman for Pinterest said. Fossil didn’t respond to a request for comment.
Piqora co-founder and Chief Executive Sharad Verma says he has removed the ability to view competitors’ images in the dashboard. He also clarified his company’s policy on caching photos from Instagram: rather than keeping photos for an indefinite period of time, Mr. Verma said he will now delete photos from his servers within 120 days.
“We might be looking at doing away with caching and figuring out a new way to optimize our software,” Mr. Verma said.
— Lisa Fleisher contributed to this article.
Write to Douglas MacMillan at firstname.lastname@example.org and Elizabeth Dwoskin at email@example.com
How do you teach a computer how to see?