Modern image recognition technology is getting really good at identifying objects. But engineers at MIT CSAIL show us how simply playing with their textures can confuse the AI into thinking an object is something completely different than what it actually is.
One of Snapchat’s best loved features is its photo filters, which use GPS data and augmented reality to add interactive “lenses” to your photos and videos. Now, the messaging startup wants to make that offering more powerful—and lucrative.
A patent application published on July 14, titled “Object Recognition Based Photo Filters,” describes lenses and filters that would be based on the picture you’re taking. For example, if you’re snapping a photo of the Empire State Building, you’d be given the option of a King Kong filter in which the ape climbs the building. The application also outlines how Snapchat could push you a free coffee offer after you post a photo of a hot cup of java.
Snapchat has 150 million users who send 10 billion videos a day, and they’ve shown no resistance to using sponsored filters. One by Gatorade during this year’s Super Bowl generated 160 million impressions.
But the deep image recognition software needed for the capabilities described in the patent goes further than what’s been offered to date and could make users uncomfortable. Based on the application, Snapchat would be looking at what you’re sending and where you are, and sending you advertisements based on that. Snapchat declined to comment on the application.
The tension between a user’s experience and building an advertising business has been a challenge faced by almost every social media company. Facebook and Twitter have had their ups and downs, and so will Snapchat. The company is internally projecting sales of $250-$350 million in 2016, and between $500 million and $1 billion in 2017. Snapchat brought in just $59 million in 2015, according to TechCrunch.
Companies file patent applications that go unused all the time, and this patent has not yet been granted. But the bet is on whether or not consumers (especially young ones, like Snapchat’s core demographic) are willing to sacrifice their privacy for fun and potentially useful products. And, for Snapchat, the answer is the difference between being a hip trendy app and the next Facebook.
Google Launches Shop the Look to Optimize Advertising for Retailers
Advertising to consumers is now a more seamless experience thanks to Google.
Last week, the search engine debuted “Shop the Look,” a new apparel and home décor experience for its retail advertisers, allowing them to reach more consumers while helping brands increase visitor traffic and boost digital sales.
As more consumers browse and purchase items on their smartphones, it is crucial for retailers to create mobile-friendly brand strategies. According to a recent Google study, 90 percent of mobile users said they aren’t absolutely sure of the specific brands they want to purchase items from when they start shopping.
To help consumers discover products instantly, Google is building on ad experiences, including Showcase Shopping ads and Shopping ads on image search. Both options allow consumers to browse items, compare prices and purchase products without typical digital complications. The new version allows people to explore the world of fashion and shop products directly from curated, inspirational images on google.com.
First, consumers can type a particular wardrobe item, like a little black dress, into Google search. Once they hit enter, a picture of a popular fashion blogger wearing a little black dress, heels and a cross-body bag may pop up on their page. Consumers may then shop for exact or similar products found in the image with a few taps.
Shop the Look images are curated by fashion partners, including Polyvore, that feature content from brands, bloggers, publishers and retailers. Similar to standard shopping ad guidelines on Google, retail advertisers will be charged on a cost-per-click basis. Retail advertisers interested in Shop the Look may register with Google Shopping Campaigns.
Intel buys chip maker Movidius to help bring computer vision to drones
Intel’s RealSense computer vision platform has been lacking a low-powered way of recognizing what its depth-sensing cameras are seeing — until now. The chip giant is buying Movidius, the designer of a range of system-on-chip products for accelerating computer vision processing.
Movidius supplies chips to drone makers such as DJI and to thermal imaging company FLIR Systems, itself a supplier of DJI. Its chips help computers figure out what they are seeing through cameras like Intel’s RealSense by breaking down the processing into a set of smaller tasks that they can execute in parallel.
There are systems that already do this using GPUs, but those are relatively power-hungry, often consuming tens of watts. That’s not a problem in fixed applications with access to mains electricity, or in cars, which have huge batteries and a way to recharge them. But in drones or other lightweight IoT devices, power consumption needs to be much lower. Movidius aims for a design power of around one watt with its Myriad 2 vision processing units.
Having largely failed to get its Atom processors into smartphones, Intel is looking for ways to lever them into other devices, such as drones.
Josh Walden, senior vice president and general manager of Intel’s New Technology Group, sees potential for Movidius to help it create systems for drones, and also for augmented, virtual and merged reality devices, robots and security cameras, he said in a post to the company’s blog. It’s not just about the chips, he said: Intel is also buying algorithms developed by Movidius for deep learning, depth processing, navigation and mapping, and natural interactions.
CURATION AND ALGORITHMS
BY BEN THOMPSON
Jimmy Iovine spared no words when it came to his opinion of algorithms during the unveiling of Apple Music:
The only song that matters as much as the song you’re listening to right now is the one that follows this. Picture this: you’re in a special moment…and the next song comes on…BZZZZZ Buzzkill! It probably happened because it was programmed by an algorithm alone. Algorithms alone can’t do that emotional task. You need a human touch. And that’s why at Apple Music we’re going to give you the right song [and] the right playlist at the right moment all on demand.
About Beats 1, the new Apple Music radio station, Iovine added: [It] plays music not based on research, not based on genre, not based on drum beats, only music that is great and feels great. A station that only has one master: music itself.
According to the Apple Music website “Zane Lowe and his handpicked team of renowned DJs create an eclectic mix of the latest and best in music”; then again, if you keep scrolling the page, you’re reminded there is more to Beats 1 than curated music:
Building your own station couldn’t be easier. Just select any song, album, or artist and it will practically build itself. Adjust the mix to hear more songs you know or discover unfamiliar gems. Love a track? We’ll play more like it. The more you fine-tune the station, the more personalized it becomes.
That sounds a bit like an algorithm. So which is more important, and why?
THE RISE OF CURATION
Curation has been all over the news for the past few weeks. At that same keynote Apple introduced Apple News, and while the presentation made it sound a bit like those user-generated radio stations — Craig Federighi introduced it as “Beautiful content from the world’s greatest sources personalized for you” — it turns out that Apple is hiring editors to, in the words of the Apple job posting, “Ensur[e] that important breaking news stories are surfaced quickly, and enterprise journalism is rewarded with high visibility.”
Apple News is hardly the only effort in the space: a month previously the New York Times released version 2 of its NYT Now app; the big headline was that the app was now free, but just as interesting was the decision to decrease the number of articles from the New York Times itself and intersperse them with a nearly equal number of articles from other publications with the intent of providing a one-stop curated news experience.
Launch one of these events and you’ll see a visually driven, curated collection of tweets. A team of editors, working under Katie Jacobs Stanton, who runs Twitter’s global media operations, will select what it thinks are the best and most relevant tweets and package them into a collection…They’ll use data tools to comb through events and understand emerging trends, and pluck the best content from the ocean of updates flowing across Twitter’s servers. But human beings will decide which tweets to include.
Lightning hasn’t launched, but Snapchat’s Live Stories have been drawing in huge viewer numbers for some time now; they too are driven by curation: Recode reports that “the company has grown its team of Live Story curators from fewer than 10 people to more than 40 people” since January, and is now producing multiple events per day. Even Instagram is adding curation to its new Explore page.
WHEN CURATING MAKES SENSE
There are two important advantages to curation:
- First, where context is critical to immediately determining how important something is — as is the case with news — human curators are, at least for now, superior to algorithms. Humans are also able to quickly identify that these forty stories are about the same event, and have the taste to decide which is the best option to present.
- Taste figures much more prominently when it comes to Apple Music and other similar endeavors. The DJ-focused Beats 1 “radio” station, for example, is clearly intended to make certain songs popular, not simply identify popularity after it is already attained. This in particular is a natural fit for Apple, and is the part of Apple Music I am most intrigued by: the company is most comfortable setting trends, not following them (as is the case with the core streaming service).
It’s possible that algorithms will one day be superior to humans at both of these functions, but I’m skeptical: the critical recognition of context and creativity are the two arenas where computers consistently underperform humans.
THE ALGORITHMIC GIANTS
That said, despite curation’s advantages the two biggest content players of all — Google and Facebook — are pure algorithmic plays. Google News has always been algorithmically driven, but the more important tool for content is Google search itself, which uses the most valuable algorithm in the world to not only find content but to rank it as well. Facebook, meanwhile, is in some respects the exact opposite of Google: rather than responding to an input Facebook proactively selects what you see when you open the app; that selection, though, is also 100% algorithmically driven.
When considering the question of what is better, algorithms or curation, I think this observation that the core Facebook and Google algorithms are actually solving two very different problems is a useful one. Google is seeking the single best answer to a direct query from an effectively infinite number of data points (i.e. the Internet); while the answer it gives is to a degree influenced by the profile Google has built about you, or the various contextual clues surrounding your search, for most queries there is one right answer that Google will return to anyone who searches for the term in question. In short, the data set is infinite (which means no human is capable of doing the job), but the target is finite. Facebook, on the other hand, creates a unique news feed for all of its 1.44 billion users: while Facebook has a huge amount of data, the amount of information any one user will ever be interested in is finite; what is infinite are the number of targets (which means Facebook could never employ enough humans to do the job). In other words, neither Google nor Facebook is able to rely on curation even if it wanted to, but the reasons that Google and Facebook rely on algorithms differ:
However, as I just noted, these two reasons run in the opposite direction: Google does personalize a bit, but it is mostly concerned with one right answer, while any single Facebook user doesn’t care and will never care about the vast majority of Facebook’s data. Presuming this relationship holds, you can actually put the above two graphs together:
This curve is a useful way to think about the aforementioned curation initiatives: curation works best when there is a good amount of data, but not too much, and the goal is a fair bit of personalization, but not on an individual basis.
The Curation-Algorithm curve makes it clear why news is an obvious curation candidate: while a lot of news happens everywhere all the time, it’s still a lot less than the sum total of information on the Internet. Moreover, the sort of news most people care about tends to be relatively widely applicable, which means personalization is useful but only to a degree. In other words, news mostly sits at the bottom of this curve. Newspapers figured this out a long time ago: editors were curators, deciding what went on the front page, what was on page 13, and what was buried completely. It mostly worked, although many editors perhaps became too enamored with “prestige” stories like world news as opposed to truly understanding what readers wanted. Moreover, once the Internet destroyed geographic monopolies, it quickly became apparent that most newspapers didn’t have the best content on the particular stories they covered; readers fled to superior alternatives wherever they happened to find them and curation gave way to social services like Twitter and Facebook.
This is what makes the NYT Now and BuzzFeed News apps so interesting: both accept the idea that their respective publications don’t have a monopoly on the best content, even as both are predicated on the idea that curation remains valuable. Apple News takes this concept further by being completely publication agnostic.
THE TWITTER QUESTION
The current Twitter product, based on a self-curated time-line, doesn’t really fit well on the Curation-Algorithm curve. Power users, through the long and arduous process of following and unfollowing a huge number of people, can ultimately arrive at a highly personalized feed that is relevant to their interests. Beginners, though, are presented with a feed that is nominally about their interests as decided by a torturous first-run experience but which in reality is a stream of mumbo-jumbo.
Project Lightning is clearly focused on hitting the algorithmic sweet spot with event-based “channels”: it’s an obvious move that should have been done years ago. What is perhaps more interesting, though, is whether Twitter ought to pursue an algorithmic feed: I think the answer is “Yes”. While Twitter’s value is its interest graph, its organizing principle to date has been people; an algorithmic feed would help Twitter more effectively bridge that disconnect.
There is one more big reason why tech companies have previously given curation short shrift, and it’s the flipside of Apple’s efforts with Music: it is a lot easier to abscond with responsibility for what you display if you can blame it on an algorithm. Human curation, on the other hand, makes it explicitly clear who is responsible for what is seen by the curating company’s users. The potential quandaries are easy to imagine: will Apple’s News app highlight a story about worker conditions in China? Will Snapchat’s planned coverage of the 2016 election favor one candidate over the other? Would Twitter have created an “event” around the exit of its CEO?
On the other hand, hiding behind algorithms is increasingly untenable as well. For one, algorithms are made by humans; choosing which story appears in your Facebook feed is the responsibility of Facebook whether they choose it explicitly or implicitly via an algorithm. Google, for its part, has successfully argued that its algorithm is protected free speech, an admission of ultimate responsibility even more profound than the company’s regular algorithmic updates explicitly designed to adjust rankings.
Google in particular has a special responsibility. I wrote in Economic Power in the Age of Abundance:
The Internet is a world of abundance, and there is a new power that matters: the ability to make sense of that abundance, to index it, to find needles in the proverbial haystack. And that power is held by Google. Thus, while the audiences advertisers crave are now hopelessly fractured amongst an effectively infinite number of publishers, the readers they seek to reach by necessity start at the same place – Google – and thus, that is where the advertising money has gone.
Ultimately, I see the embrace of curation as a mark of maturation of the technology industry. Today’s technology companies have massive amounts of influence over what people the world over see and consume, and while there is a long ways to go when it comes to transparency about what is seen and why, at least everyone is now being honest about possessing that power in the first place.
This is an excerpt from The Washington Post, written by Jessica Contrera, May 25
Richard Prince’s Instagram screenshots at Frieze Art Fair in New York. (Marco Scozzaro/Frieze)
The Internet is the place where nothing goes to die.
Those embarrassing photos of your high school dance you marked “private” on Facebook? The drunk Instagram posts? The NSFW snapchats? If you use social media, you’ve probably heard a warning akin to “don’t post anything you wouldn’t want your employer (or future employer) to see.”
We agree, and are adding this caveat: Don’t post anything you wouldn’t want hanging in an art gallery.
This month, painter and photographer Richard Prince reminded us that what you post is public, and given the flexibility of copyright laws, can be shared — and sold — for anyone to see. As a part of the Frieze Art Fair in New York, Prince displayed giant screenshots of other people’s Instagram photos without warning or permission.
The collection, “New Portraits,” is primarily made up of pictures of women, many in sexually charged poses. They are not paintings, but screenshots that have been enlarged to 6-foot-tall inkjet prints. According to Vulture, nearly every piece sold for $90,000 each.
How is this okay?
First, you should know that Richard Prince has been “re-photographing” since the 1970s. He takes pictures of photos in magazines, advertisements, books or actors’ headshots, then alters them to varying degrees. Often, they look nearly identical to the originals. This has, of course, led to legal trouble. In 2008, French photographer Patrick Cariou sued Prince after he re-photographed Cariou’s images of Jamaica’s Rastafarian community. Although Cariou won at first, on appeal, the court ruled that Prince had not committed copyright infringement because his works were “transformative.”
In other words, Prince could make slight adjustments to the photos and call them his own.
Prince’s 1977 work “Untitled (four single men with interchangeable backgrounds looking to the right),” which is made of photos that previously appeared in print. (Metropolitan Museum of Art, “The Pictures Generation” exhibition)
This is what he did with the Instagram photos. Although he did not alter the usernames or the photos themselves, he removed captions. He then added odd comments on each photo, such as “DVD workshops. Button down. I fit in one leg now. Will it work? Leap of faith” from the account “richardprince1234.” The account currently has 10,200 followers but not a single picture — perhaps so you can’t steal his images in return?
“New Portraits” first debuted last year at Gagosian Gallery on Madison Avenue, the same location where the artist displayed the Rastafarian images he was sued for.
If someone wanted to argue that this collection is not “transformative” enough to be legal, they would have to file a lawsuit on their own. Upon seeing this story, a spokesman from Instagram said:
“People in the Instagram community own their photos, period. On the platform, if someone feels that their copyright has been violated, they can report it to us and we will take appropriate action. Off the platform, content owners can enforce their legal rights.”
Basically, if someone copies your Instagram to an account of their own, the company can do something about it. If they copy your work to somewhere outside of the social network, like a fancy New York gallery, you’re on your own.
Prince appears to be enjoying the controversial attention. He has been re-tweeting and re-posting his many critics.
This week Stephen Wolfram, founder and chief executive of Wolfram Research, announced a new component of the Wolfram Language for programming called ImageIdentify. Wolfram also introduced a new website, dubbed The Wolfram Language Image Identification Project, that demonstrates the language’s new capabilities.
The new site lets you upload images and get inferences and definitions in response. You can provide feedback, which should help it become more accurate. You can hit buttons like “Great!,” “Could be better,” “Missed the point,” and “What the heck?!” After you choose one, the service offers a few more guesses, and a text box where you can type in a tag. Then you can type in your email address, so it can tell you “when ImageIdentify learns more about your kind of image.”
The service uses a trendy type of artificial intelligence called deep learning. It draws on artificial neural networks, which train on a large quantity of information, like pictures, and then make inferences when you give them new information, like a new picture. Big web companies like Facebook, Google, and Microsoft use deep learning for various purposes, and increasingly smaller companies have been exposing deep learning tools for pretty much anyone to try out.
To get a rough sense of the power of the new Wolfram technology, I decided to put it up against other existing image-recognition systems you can test out on the Internet today, from CamFind, Clarifai, MetaMind, Orbeus, and IBM-owned AlchemyAPI. I chose images from Flickr that seemed to clearly fall into the 1,000 categories used for the 2014 ImageNet visual recognition competition. It was unscientific — just for the sake of curiosity.
What I found is that Wolfram’s new system doesn’t seem to be all that bad. It wasn’t overly conservative or vague, and it didn’t make many obvious mistakes — although it wasn’t as consistently accurate as MetaMind, for one. With time, Wolfram’s technology should improve — especially as people point out its flaws.
Here are 10 of the tests I ran to reach my conclusion.
Wolfram ImageIdentify: tea
CamFind: white ceramic mug
Clarifai: coffee cup nobody tea mug cafe hot ceramic coffee cup cutout
MetaMind: Coffee mug
Wolfram ImageIdentify: magic mushroom
CamFind: white mushroom
Clarifai: mushroom fungi fungus toadstool nature grass fall moss forest autumn
Wolfram ImageIdentify: spatula
CamFind: black kitchen turner
Clarifai: steel wood knife handle iron fork equipment nobody tool chrome
Wolfram ImageIdentify: scoreboard
CamFind: baseball scoreboard
Clarifai: scoreboard soccer stadium football game competition goal group north America match
5. German shepherd
Wolfram ImageIdentify: German shepherd
CamFind: black and brown German shepherd
Clarifai: dog canine cute puppy mammal loyalty grass sheepdog fur German shepherd
MetaMind: German Shepherd, German Shepherd Dog, German Police Dog, Alsatian
Wolfram ImageIdentify: tufted puffin
CamFind: toucan bird
Clarifai: bird one north America nobody animal people adult nature two outdoors
7. Indian cobra
Wolfram ImageIdentify: black-necked cobra
CamFind: brown and beige cobra snake
Clarifai: snake nobody reptile cobra wildlife daytime sand rattlesnake north America desert
MetaMind: Indian cobra, Naja Naja
Wolfram ImageIdentify: strawberry
CamFind: red strawberry fruit
Clarifai: fruit sweet food strawberry ripe juicy berry healthy isolated delicious
Wolfram ImageIdentify: cooking pan
CamFind: gray steel frying pan
Clarifai: ball nobody pan cutout kitchenware north America tableware competition bowl glass
Orbeus: frying pan
AlchemyAPI: (No tags)
10. Shoe store
Wolfram ImageIdentify: store
CamFind: black crocs
Clarifai: colour street people color car mall road fair architecture hotel
MetaMind: Shoe Shop, Shoe Store
Orbeus: shoe shop
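As a rough illustration of how results like these could be tallied, here is a minimal Python sketch. It uses only the three tests above whose category labels are shown (5, 7 and 10), with outputs copied from the lists. The scoring rule, a case-insensitive substring match against the label, is an assumption made for this sketch and not the method used in the comparison; the names RESULTS and tally_hits are hypothetical.

```python
# Outputs copied from tests 5, 7 and 10 above.
# The scoring rule (substring match) is an assumption of this sketch.
RESULTS = {
    "German shepherd": {
        "Wolfram ImageIdentify": "German shepherd",
        "CamFind": "black and brown German shepherd",
        "Clarifai": "dog canine cute puppy mammal loyalty grass sheepdog fur German shepherd",
        "MetaMind": "German Shepherd, German Shepherd Dog, German Police Dog, Alsatian",
    },
    "Indian cobra": {
        "Wolfram ImageIdentify": "black-necked cobra",
        "CamFind": "brown and beige cobra snake",
        "Clarifai": "snake nobody reptile cobra wildlife daytime sand rattlesnake north America desert",
        "MetaMind": "Indian cobra, Naja Naja",
    },
    "Shoe store": {
        "Wolfram ImageIdentify": "store",
        "CamFind": "black crocs",
        "Clarifai": "colour street people color car mall road fair architecture hotel",
        "MetaMind": "Shoe Shop, Shoe Store",
        "Orbeus": "shoe shop",
    },
}


def tally_hits(results):
    """Return {service: (hits, attempts)} using case-insensitive substring matching."""
    tally = {}
    for label, outputs in results.items():
        for service, output in outputs.items():
            hits, attempts = tally.get(service, (0, 0))
            # A "hit" means the full category label appears somewhere in the output.
            hit = label.lower() in output.lower()
            tally[service] = (hits + int(hit), attempts + 1)
    return tally


if __name__ == "__main__":
    for service, (hits, attempts) in tally_hits(RESULTS).items():
        print(f"{service}: {hits}/{attempts}")
```

A strict match like this is crude: it counts “shoe shop” as a miss for “Shoe store,” which a human reader would accept, so a real comparison would need synonym handling or manual judgment.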
This is an excerpt from Wired, written by Molly McHugh, 5/24/15
IF YOU’RE LIKE most people on Instagram, you’ll scroll through all 22 filters, carefully consider the nuances of Inkwell vs. Lo-Fi vs. Hudson, and then settle on one of the filters you always use. Oh sure, there are so many filters, but you always go back to your favorites “just because.”
Turns out it isn’t “just because.” There are some specific reasons you rely upon your old faithfuls, and a growing body of science examining how and why people choose filters and how those choices influence others’ reactions to the photo. According to a study out of Yahoo Labs, researchers looked at 7.6 million Flickr photos (many of which originated on Instagram and were uploaded to Flickr) and found “filtered photos are 21 percent more likely to be viewed and 45 percent more likely to be commented on.”
This study is but a drop in a fairly shallow pool: Despite mobile photography’s massive popularity, it’s been largely ignored by academics. “There is little work—scholarly or otherwise—around filters, their use, and their effect on photo-sharing communities,” the Yahoo Labs study explains. That’s due in part to photos being harder than text to analyze, but that shouldn’t be an excuse anymore, especially given the active commenting community on Instagram and other social media.
The Yahoo Labs team is not alone in its fascination. Researchers at Arizona State University have been studying Instagram and its filters since last year. “We were (and continue to be) motivated by the fact that Instagram has received very little attention from the research community,” says one of them, Subbarao Kambhampati. “We believe that a careful analysis of Instagram can give us a valuable window into our collective online behavior.”
The Yahoo Study focused specifically on filters, and found people like higher contrast and corrected exposure, and find a warmer temperature more appealing than a cooler one. “Serious hobbyists” use filters only to correct a problem—say, correct the exposure. “More casual photographers” are more likely to manipulate their images with filters or adjustments that make them appear more “artificial,” according to the study.
One particularly interesting part of the research examined just who’s using Flickr. A few years ago, the “Flickr vs. Instagram” debate could be cast as “real camera vs smartphone camera.” That’s no longer the case. The iPhone rules all.
“The iPhone has been the most popular camera for years now,” says David Ayman Shamma, one of the Yahoo researchers. The proliferation of the iPhone and smartphones in general has led to photography of all kinds becoming a creative outlet for millions. “It’s been the dream since the Kodak Brownie. With it, comes the creative space for many outlets and photographers, from people shooting on vintage film to food bloggers.”
Shamma says profiling Flickr users is a more complex task, because so many people use it for so many things. “Some people on Flickr only want to push their best portfolio pieces from DSLRs, while others publish their daily lives from their iPad camera, and many do a mix of cameras and content.”
No matter what you shoot or where you post it, you will be heartened to hear filter snobbery is dying. Pro shooters who once sneered at Instagram obsessives and their love of Rise, Mayfair, and X-Pro II aren’t quite as judgmental as they used to be (or, perhaps, as we only thought they were). These days, everyone uses filters.
“One of the surprising things to me was the pro set was talking fondly about the filters,” Shamma says. “Not that I thought they’d be snobby and elitist before the study, but I assumed they’d rather use their software tools of the trade over a one-click filter.” If you don’t want to say it, it’s OK, I will: I thought they would be snobby and elitist.
Of course, it’s impossible to talk about filters without talking about Instagram, because it is the world’s most popular photo app (it’s one of the world’s most popular apps, period). It seems there could be a difference between people who use Instagram to take and manipulate photos and those who do so with Flickr, or someone who uses Flickr to edit a photo and then cross-posts it to Instagram (or vice versa).
“There are similarities and differences for sure and we can see them by looking at what’s uploaded to Flickr via the Flickr app and what’s uploaded to Flickr through Instagram,” Shamma says. “The nature photos on Flickr from Instagram show more engagement when they are filtered, so it’s a function of what sub-community you’re speaking to.”
But Shamma says there are unifying factors when it comes to filters, regardless of platform or skill level. Could someone, therefore, use this research to design the perfect filter? In a word, no. “As awesome as something automatic might sound, there really is no silver bullet here,” Shamma says.
That’s because there’s more at play than how a filter looks. In many cases, the act of choosing the filter is equally important. “When we interviewed people for this study, we found that the photographers, regardless of skill level, enjoyed the process of selecting a filter,” he says. This explains why people painstakingly scroll through them all before invariably choosing a favorite. Even if someone could engineer the perfect filter, people wouldn’t want to lose out on seeing their photos transformed by all those filters. The element of choice, the function of looking and choosing, is one reason people so love filters in the first place.
In case you were wondering, another study, published in 2014 by Arizona State researchers, identified the most popular Instagram filters. They are, in descending order of popularity, no filter, Amaro, X-Pro II, Valencia, and Rise. Seeing “no filter” is a bit of a shock, given what Yahoo’s study says, but what’s most interesting is that the most popular filters may be the most popular because everyone thinks they’re popular. “These top five filters are actually present in the first seven filters of Instagram GUI at that time,” Kambhampati explains. “This brings up the possibility that the (accidental?) placement of filters has more to do with their eventual popularity than any conscious photographic choice by the Instagram user.” The same study also found that there are generally only a few categories of Instagram photos: friends, food, gadgets, quote pics, pets, activities, selfies, and fashion.
The researchers are still digging into Instagram, and plan to look at how the social network diffuses information and just what makes an image “go viral.” Such questions have been asked of other social networks, but not Instagram. “We have come up with a number of indirect measures to study diffusion (in particular, by studying the number of ‘likes’ and comments received in terms of the number of hops separating the liking and commenting user from the posting user).” The researchers soon will present a paper showing how we can glean sentiment from Instagram images with the help of “image features and the features from the textual comments.” Perhaps someday soon, we won’t simply know what Instagram filters are popular, but also how they make us feel.
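To picture what a hop-based measure might look like, here is a minimal Python sketch that groups likes by how many follow links separate each liking user from the poster. It is an illustration only and not the researchers’ actual method: the follow graph, the account names, and the functions hop_distance and likes_by_hops are all hypothetical, and hops are counted with a plain breadth-first search.

```python
from collections import deque


def hop_distance(follows, source, target):
    """Breadth-first search: fewest follow links from source to target, or None."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        user, dist = queue.popleft()
        for neighbour in follows.get(user, ()):
            if neighbour == target:
                return dist + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None  # no path: the liker is not connected to the poster


def likes_by_hops(follows, poster, likers):
    """Count likes grouped by each liker's hop distance to the poster."""
    counts = {}
    for liker in likers:
        d = hop_distance(follows, liker, poster)
        counts[d] = counts.get(d, 0) + 1
    return counts


# Hypothetical toy graph: each key follows the accounts in its list.
FOLLOWS = {
    "ana": ["poster"],
    "ben": ["ana"],
    "cy": ["ben"],
    "dee": [],
}

if __name__ == "__main__":
    print(likes_by_hops(FOLLOWS, "poster", ["ana", "ben", "cy", "dee"]))
```

In this toy graph, a like from an account that follows the poster directly counts as one hop, and a like from an account with no path to the poster falls under None. A photo that collects many likes at two or more hops would suggest it is spreading beyond the poster’s immediate followers.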
This is an excerpt from TechCrunch, posted by Sarah Perez (@sarahintampa)
Not everyone was happy with last week’s major revamp of Yahoo-owned photo-sharing site Flickr. A small, but very vocal, portion of Flickr’s user base of 100 million members immediately took to the forums to lament the fact that the site’s new “auto-tagging” feature was enabled by default, and, worse, that there was no opt-out option provided. But that may now be changing, we understand.
Flickr recently introduced a series of upgrades to its service on the web and on mobile designed to make every aspect of photo editing, organization and sharing easier. A couple of the more notable changes were the addition of auto-tagging and new image-recognition capabilities. Combined, these features allow Flickr to identify what’s in a photo, and then automatically categorize it on users’ behalf by adding tags. This, in turn, makes images easier to surface by way of search.
Auto-tagging especially makes sense in today’s highly mobile age, where users take large numbers of photos and most no longer have the time or inclination to carefully group them or categorize them by manually adding tags. Tags, after all, are a holdover from an earlier time – the not-too-distant past before we all carried smartphones in our pockets capable of taking quality photos.
But for many Flickr users, tags are something they still feel strongly about, judging by the forum’s many comments. With over 1,370 replies to the official Flickr post (and growing), these users have been venting their frustration about the addition of auto-tagging. Many of those commenting have actually been fairly conscientious about their tags over the years, and don’t like that Flickr is now adding its own tags to their photos.
In addition, several also complain that Flickr’s auto-tags simply aren’t that accurate. In some cases, those mistakes are somewhat benign – a BMW gets tagged as a Ferrari, for example. But other times, they can be really terrible – as in the case of a user whose Auschwitz photos were incorrectly tagged as “sport,” for instance.
The problem lies with the fact that an algorithmic system of tagging is never going to be perfect – though it is capable of improving over time based on users’ corrections. But some are unwilling to wait for that training process to occur. They just want out. Period.
However, Flickr doesn’t offer an option to disable the auto-tagging at all, which is a rather bold stance to take. And while users can batch edit a group of tagged photos, they can’t batch edit auto-generated tags. So the only way to edit the auto-generated tags is to go into each photo individually. This is far too time-consuming for most people to manage, which is why so many are upset.
But Flickr tells us that it’s taking the community feedback on the matter seriously, and is evaluating an option that would allow an opt-out of the automated tagging. The option is not yet being built, but it is at least being actively discussed, from what we understand.
The company further explains that auto-tagging is actually a fairly crucial part of the upgraded service, as it is what powers a number of the new features, including the “Magic View,” which helps users organize and share their photos based on topic, as well as the new search tools and other “future features” still in the works. That could explain why Flickr felt strongly enough about auto-tagging to not make it an opt-in option in the first place, as well as why there’s no “off” switch for the time being.
While likely a large majority of consumers won’t care (or maybe even notice), for those power users and others who rely heavily on Flickr as their main online image repository, adding the “opt-out” option – even as a gesture to the community – would be appreciated.
RUSSELL KIRSCH SAYS he’s sorry.
More than 50 years ago, Kirsch took a picture of his infant son and scanned it into a computer. It was the first digital image: a grainy, black-and-white baby picture that literally changed the way we view the world. With it, the smoothness of images captured on film was shattered to bits.
The square pixel became the norm, thanks in part to Kirsch, and the world got a little bit rougher around the edges.
As a scientist at the National Bureau of Standards in the 1950s, Kirsch worked with the only programmable computer in the United States. “The only thing that constrained us was what we imagined,” he says. “So there were a lot of things we thought of doing. One of which was, what would happen if computers could see the world the way we see it?”
Kirsch and his colleagues couldn’t possibly know the answer to that question. Their work laid the foundations for satellite imagery, CT scans, virtual reality and Facebook.
Kirsch made that first digital image using an apparatus that transformed his picture into the binary language of computers, a regular grid of zeros and ones. A mere 176 by 176 pixels, that first image was built from roughly one one-thousandth the information in pictures captured with today’s digital cameras. Back then, the computer’s memory capacity limited the image’s size. But today, bits have become so cheap that a person can walk around with thousands of digital baby photos stored on a pocket-sized device that also makes phone calls, browses the Internet and even takes photos.
Yet science is still grappling with the limits set by the square pixel.
“Squares was the logical thing to do,” Kirsch says. “Of course, the logical thing was not the only possibility … but we used squares. It was something very foolish that everyone in the world has been suffering from ever since.”
Now retired and living in Portland, Oregon, Kirsch recently set out to make amends. Inspired by the mosaic builders of antiquity who constructed scenes of stunning detail with bits of tile, Kirsch has written a program that turns the chunky, clunky squares of a digital image into a smoother picture made of variably shaped pixels.
He applied the program to a more recent picture of his son, now 53 years old, which appears with Kirsch’s analysis in the May/June issue of the Journal of Research of the National Institute of Standards and Technology.
“Finally,” he says, “at my advanced age of 81, I decided that instead of just complaining about what I did, I ought to do something about it.”
Kirsch’s method assesses a square-pixel picture with masks that are 6 by 6 pixels each and looks for the best way to divide this larger pixel cleanly into two areas of the greatest contrast. The program tries two different masks over each area — in one, a seam divides the mask into two rough triangles, and in the other a seam creates two rough rectangles. Each mask is then rotated until the program finds the configuration that splits the 6-by-6 area into sections that contrast the most. Then, similar pixels on either side of the seam are fused.
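For readers who want to see the idea in code, here is a minimal Python sketch of the mask search described above. It is not Kirsch's program. It assumes grayscale values, tries only four 90-degree rotations of each seam, measures contrast as the difference between the two sides' average brightness, and "fuses" each side by replacing it with that average. All function names are hypothetical.

```python
SIZE = 6  # block edge length, as described in the article


def rotate(mask):
    """Rotate a square boolean mask 90 degrees clockwise."""
    return [list(row) for row in zip(*mask[::-1])]


def candidate_masks():
    """Two seams (diagonal and straight), each in four rotations.

    Assumption: the real program may rotate in finer steps.
    """
    diagonal = [[c > r for c in range(SIZE)] for r in range(SIZE)]
    straight = [[c >= SIZE // 2 for c in range(SIZE)] for r in range(SIZE)]
    masks = []
    for base in (diagonal, straight):
        mask = base
        for _ in range(4):
            masks.append(mask)
            mask = rotate(mask)
    return masks


def split_means(block, mask):
    """Average brightness on each side of the seam."""
    side_a = [block[r][c] for r in range(SIZE) for c in range(SIZE) if mask[r][c]]
    side_b = [block[r][c] for r in range(SIZE) for c in range(SIZE) if not mask[r][c]]
    return sum(side_a) / len(side_a), sum(side_b) / len(side_b)


def contrast(block, mask):
    mean_a, mean_b = split_means(block, mask)
    return abs(mean_a - mean_b)


def fuse_block(block):
    """Pick the highest-contrast seam, then flatten each side to its mean."""
    best = max(candidate_masks(), key=lambda m: contrast(block, m))
    mean_a, mean_b = split_means(block, best)
    return [[mean_a if best[r][c] else mean_b for c in range(SIZE)]
            for r in range(SIZE)]


# A block that is dark on the left half and bright on the right half.
block = [[0, 0, 0, 100, 100, 100] for _ in range(SIZE)]
fused = fuse_block(block)
```

In this example the straight vertical seam matches the edge exactly, so the fused block keeps the same dark and bright halves.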
Kirsch has also used the program to clean up an MRI scan of his head. The program may find a home in the medical community, he says, where it’s standard to feed images such as X-rays into a computer.
Kirsch’s approach addresses a conundrum that the field of computational photography continues to grapple with, says David Brady, head of Duke University’s imaging and spectroscopy program in Durham, N.C.
Images built from pixels can show an incredible amount of detail, Brady says. “It’s fun to talk to kids about this because they don’t know what I’m talking about anymore, but the snow on analog television — a block-based imager can reconstruct that pattern exactly.”
But images taken from real life never look like that, Brady says. Typically, they have several large uniform sections — forehead, red shirt, blue tie. This means there’s a high probability that one pixel in an image will look the same as the pixel next to it. There’s no need to send all those look-alike pixels as single pieces of information; the information that’s really important is where things are different.
“I always joke that it’s like Los Angeles weather,” Brady says. “If you were a weatherman in Los Angeles you would almost always be right if you say tomorrow is going to be the same weather as today. So one thing you can do is say, I’m going to assume the next pixel is like this one. Don’t talk to me, don’t tell me anything about the image, until you get something different. A good weatherman in Los Angeles tells you when a big storm is coming. In an image, that’s an edge. You want to assume smoothness but have a measurement system that’s capable of accurately finding where the edges are.”
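Brady's weatherman rule can be sketched in a few lines of Python. This is only an illustration of the "don't tell me anything until something changes" idea applied to one row of pixels, not the scheme any particular camera or file format uses, and the function names are hypothetical.

```python
def encode_changes(row):
    """Record (position, value) only where a pixel differs from its left neighbor."""
    changes = []
    previous = None
    for position, value in enumerate(row):
        if value != previous:
            changes.append((position, value))
            previous = value
    return changes


def decode_changes(changes, length):
    """Rebuild the row by assuming every pixel matches the last reported one."""
    row = []
    for index, (start, value) in enumerate(changes):
        end = changes[index + 1][0] if index + 1 < len(changes) else length
        row.extend([value] * (end - start))
    return row


# A row with two edges: bright, then dark, then bright again.
row = [200] * 5 + [30] * 3 + [200] * 2
changes = encode_changes(row)
restored = decode_changes(changes, len(row))
```

Ten pixels shrink to three reports, one for the starting value and one for each edge, and the row can still be rebuilt exactly.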
Where Kirsch uses masks to accomplish that task, researchers today typically use equations far more complex than his to strike the balance between shedding unnecessary information and keeping detail. Pixels are still the starting point of digital pictures today, but math — wavelet theory in particular — is what converts the pixels into the picture. Wavelet theory takes a small number of measurements and turns them into the best representation of what’s been measured. This best estimation of a picture allows a megapixel image to be stored as mere kilobytes of data.
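To give a flavor of how wavelets shed information, here is a minimal Python sketch using the Haar wavelet, the simplest one, on a single row of pixel values. It is an illustration under assumptions, not what production codecs do: it runs only one level of the transform, expects an even number of values, and uses a hypothetical threshold to decide which details are worth keeping.

```python
def haar_forward(signal):
    """One Haar level: pairwise averages plus the half-differences (details)."""
    pairs = list(zip(signal[0::2], signal[1::2]))
    averages = [(a + b) / 2 for a, b in pairs]
    details = [(a - b) / 2 for a, b in pairs]
    return averages, details


def haar_inverse(averages, details):
    """Undo haar_forward: each average and detail yields two pixels."""
    signal = []
    for average, detail in zip(averages, details):
        signal.extend([average + detail, average - detail])
    return signal


def compress(signal, threshold):
    """Zero out details smaller than the threshold; those are the smooth parts."""
    averages, details = haar_forward(signal)
    kept = [d if abs(d) >= threshold else 0.0 for d in details]
    return averages, kept


# Mostly smooth row with one sharp edge between 50 and 90.
signal = [10, 10, 10, 12, 50, 90, 90, 90]
averages, kept = compress(signal, threshold=2)
approximation = haar_inverse(averages, kept)
```

Only one detail value survives the threshold, the one at the edge. The reconstructed row is slightly smoothed where the original barely varied, while the edge itself is preserved.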
Images: 1) This baby picture, scanned in 1957, was the first digital image. At 176 by 176 pixels, its size was limited by the memory capacity of the computer./NIST. 2) Before transforming the square-pixel image, a close-up of one ear appears as a blocky stack. The variably shaped pixel treatment turns it back into an ear./NIST.
SQUARE PIXEL INVENTOR TRIES TO SMOOTH THINGS OUT