Friday, May 21, 2010

Data Mining in the Deep, Dark Social Networks of Patients. Word to Pharma: Caveat Emptor

Yesterday, I posted the following tweet:

PatientsLikeMe blocks "scraper." Is this a trend in SM? Pharma can't mine patient msgs & learn? See http://bit.ly/9lWSIT #fdasm #hcsm

Eventually, that tweet reached the desktop of Jamie Heywood, Co-founder of PatientsLikeMe (PLM), who contacted me to discuss the "richer story" beneath the surface.

But before I get to that, here's how Jamie's partner, Ben Heywood, described what was going on vis-à-vis the "scraper" incident (see here for his full statement):
"Recently, we suspended a user who registered as a patient in the Mood community. This user was not a patient, but rather a computer program that scrapes (i.e. reads and stores) forum information. Our system, which alerts us when an account has looked at too many posts or too many patient profiles within a specified time interval, detected the user. We have verified the account was linked to a major media monitoring company, and we have since sent a cease and desist letter to its executives.

"While this was not a security breach, it was a clear violation of our User Agreement (which expressly forbids this type of activity) and, more significantly, a violation of the community’s trust. Your Account Information (e.g. your names and emails) was NOT in danger of being stolen. It is likely that the forum information that was “scraped” would be sold as part of that company’s Internet monitoring product. In fact, we sell a similar service, PatientsLikeMeListenTM, to our clients so they better understand the voice of the patient."
The important issue for PLM is that the media monitoring company -- probably employed by an unnamed pharmaceutical company -- was not an authentic patient and violated PLM's User Agreement, which states "You may not use any robot, spider, scraper, or other automated means to access the Site or content or services provided on the Site for any purposes."
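
Ben's statement says only that PLM's system raises an alert when an account looks at too many posts or profiles within a specified time interval. PLM's actual implementation is not public, so the following is just a minimal sketch of that general idea in Python, using a sliding time window. The class name, the threshold of 3 views, and the 60-second window are all hypothetical values chosen for illustration.

```python
from collections import defaultdict, deque


class ViewRateMonitor:
    """Flag accounts that view too many pages within a time window.

    Illustrative only: the thresholds and names are assumptions,
    not PLM's real system.
    """

    def __init__(self, max_views, window_seconds):
        self.max_views = max_views
        self.window_seconds = window_seconds
        # Maps an account id to the timestamps of its recent views.
        self._views = defaultdict(deque)

    def record_view(self, account_id, timestamp):
        """Record one page view; return True if the account is over the limit."""
        views = self._views[account_id]
        views.append(timestamp)
        # Discard views that have aged out of the window.
        while views and timestamp - views[0] >= self.window_seconds:
            views.popleft()
        return len(views) > self.max_views


# Hypothetical usage: allow at most 3 views per 60 seconds.
monitor = ViewRateMonitor(max_views=3, window_seconds=60)
for t in (0, 10, 20, 30):
    flagged = monitor.record_view("account-123", t)
print(flagged)  # the fourth view inside one minute trips the alert
```

A human reading a forum rarely trips a limit like this, while a program fetching pages as fast as it can does so almost immediately, which is presumably why this kind of check catches scrapers.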

Recently, I wrote about BzzAgent operatives trolling social media on behalf of pharmaceutical companies (see "Are J&J Agents Trolling for Adverse Events on the Internet?"). One "operative" admitted that he took a survey through BzzAgent for Johnson & Johnson, "which basically was more of a 'contract' where if chosen, I agreed to notify J&J if I became aware of any negative talk about their products."

What this person was doing for J&J was not exactly "scraping" social media patient sites because the monitoring was being done manually by a real person. This may be a neat way of getting around User Agreements of "closed" patient communities like PLM, but it is not nearly as efficient or effective as using software tools (listen to the podcast "Aligning Your Message with Patient Needs: How Social Media Can Help" for more insight on that). But, if J&J enlisted the help of thousands of real people to monitor patient communities, it could be pretty effective. BTW, PLM's policy also forbids individuals from using information collected manually or otherwise on PLM "in connection with any commercial endeavors."

Since PLM is using its own "scraper" software to troll its closed communities to create reports for pharma clients (eg, see "PatientsLikeMe Reports High Rate of Adverse Event Reporting Among Its Members"), it has a vested interest in preventing rogue pharma companies from hiring "scraper" agents to mine PLM for the same data it is selling to its own pharma clients.

This whole "scraping" incident raises the issue of "Transparency, Openness and Privacy." Anyone can join PLM and claim to be a patient -- we are ALL patients at one time or another. So, when I joined, I was not violating PLM's User Agreement, which also states "To become a member and access the area on this Site reserved for members (the 'Member Area'), PatientsLikeMe requires that you are either (a) a diagnosed patient of the particular community you are joining, (b) a caregiver for a patient eligible to join such community, (c) a health care professional (e.g. doctor, nurse, health researcher, etc.), or (d) a guest as authorized by a PatientsLikeMe member or employee." That includes practically everyone in the known inhabited universe.

Someone "posing as a patient" on PLM is practically an oxymoron (or whatever the term is). Therefore, violation of "openness" is not the issue. "Privacy" is also not the issue. PLM members are well aware that whatever they say in PLM's "closed" communities can be revealed to third parties either by PLM itself (de-identified) or by other PLM members (as part of a non-commercial endeavor).

"Transparency" appears to be the major issue here, and one that pharmaceutical companies should be wary of when they hire agents to monitor patient communities. Jamie likened it to a situation in the real world where his church's copper gutters were stolen and the copper resold to people having repairs done on their houses. The end users of the stolen copper may not have known how it was obtained, nor did they probably care. Pharma companies, however, cannot be so blase -- they have a lot more at stake if they accept stolen "copper" and get caught. As Jamie said, you wouldn't want to see that story in the Wall Street Journal (or here on this blog).

The lesson for pharma companies is "caveat emptor" -- when hiring a third party to monitor "closed" patient social networks, be sure that the third party does not violate the policies of these communities and that it works in an open and transparent way. For communities like PLM, the only way is to hire PLM to do the scraping for you (eg, see "UCB & PatientsLikeMe: Embracing Social Media, Adverse Events Included!").

[BTW, I'd like to see a pledge not to violate policies of patient social networks as part of every pharma company's social media principles. AstraZeneca, for example, might consider adding it to its "five important principles for online dialogue" (see "Transparency and Trust in Health Communications," posted by Bob Perkins, Vice President, Public Policy, AstraZeneca, on AZ Health Connections Blog).]

PLM community members, BTW, seem to be OK with PLM's sale of information to third parties: "I can live with you selling the information as long as you continue to reinvest in the infrastructure of the site and keep it more than just up to date," said one person in a comment to Ben's post. "Cutting edge is what I have found here and I expect you will still provide this."

You are probably asking yourself "Is John Mack OK with it?" Well, it's not something I would do. But it's a valid business model that provides useful information to clients. As I mention below, however, I think PLM should also open the door to its proprietary database for the public good -- for free.

Now, let's discuss the "dark" aspect of all this.

"Closed communities" like the member areas of PLM are "dark" to search engine spiders, which are forbidden to troll these areas of PLM and index the content. There are technical ways of doing this, but most search engines agree to abide by requests that certain domains not be indexed.

This is something known as the Dark Net or Deep Web. According to Wikipedia, "searching on the Internet today can be compared to dragging a net across the surface of the ocean; a great deal may be caught in the net, but there is a wealth of information that is deep and therefore missed. Most of the Web's information is buried far down on dynamically generated sites, and standard search engines do not find it. Traditional search engines cannot 'see' or retrieve content in the deep Web -- those pages do not exist until they are created dynamically as the result of a specific search. The deep Web is several orders of magnitude larger than the surface Web."

Most pages in forums like those on PLM are dynamically generated. However, it is possible to flip a "switch" that allows the page to be indexed (don't ask me how this works; I only know it's possible because I use such a switch on the Pharma Marketing Network Discussion Forums).
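
I can't say which mechanism PLM or my own forum software uses for that "switch." One widely used convention for asking search engines not to index parts of a site, though, is a robots.txt file (the Robots Exclusion Protocol), which well-behaved crawlers read before fetching pages. Here is a small sketch, using Python's standard urllib.robotparser module, of how a crawler would interpret such a file. The /members/ path, the example.com addresses, and the "ExampleBot" name are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# A hypothetical robots.txt asking all crawlers to stay out of /members/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /members/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def may_crawl(url, user_agent="ExampleBot"):
    """Return True if the robots.txt rules allow this crawler to fetch the URL."""
    return parser.can_fetch(user_agent, url)


# The closed member area is off limits; the public page is not.
print(may_crawl("https://example.com/members/forum/123"))
print(may_crawl("https://example.com/about"))
```

Note that a file like this is only a request. It keeps cooperative search engines out, but it does nothing to stop a scraper that chooses to ignore it, which is why sites also need login walls and monitoring.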

What this all means is that, potentially, much of the best patient-generated information found on social networks is "dark" to pharma companies unless the owners of these communities flip that "switch" or allow pharma marketers access (paid or otherwise). Hopefully, however, these sites might also perform a FREE public service such as what I talked about in this post: "If Patients Know Best, then Patient Social Networks Can Help Capture and Report AEs"

6 comments:

  1. This is interesting from the point of view that it demonstrates pharma is not strategically creative in the least. They are strip mining data without even considering returning anything to the community.

  2. I have been watching the Facebook controversy unfold, waiting for it to come home to health care - and here it is.

    I appreciate your perspective on the business implications and thought you might be interested in the consumer implications, which I wrote about here:

    http://e-patients.net/archives/2010/05/a-new-conversation-about-health-privacy-whos-in.html

    Mark's phrase "strip mining" and Jamie's analogy of petty copper thievery remind me of Janice McCallum's comment over on e-patients.net. I think we are talking not only about motivation, but also about scale. I think scale is part of what is tripping up Facebook, for example, and yet scale is what they need to be as useful as they are. Public health records need to be collected & maintained on a grand scale to be useful, but that's also why the gov't must protect them and have clear policies about who can access them and for what purpose. PatientsLikeMe is in the middle - a dot-com doing the work of a public-health entity.

  3. Hello John,

    I was wondering what is the right way for a media monitoring company to scan the data produced by PatientsLikeMe.com? Since the data is somewhat public, should it be possible to be monitored, even if an access fee is paid to it?

    Horatiu

  4. Horatiu,

    You will have to ask PLM that question. But from what I understand, monitoring companies need to pay PLM to access the data in CLOSED areas for them. There is no such thing as "somewhat" public, although there are open forums that are completely "public" -- not sure how useful they are, however.

  5. Thanks John, I will ask them directly. By "somewhat" public, I wanted to say that (almost) everybody can be considered a patient, so it can have access to their forums.

  6. Perhaps PLM follows the Sermo model. For a fee, Pfizer medical people (bona fide physicians) can join Sermo and participate in discussions and sponsor polls, etc. as long as they follow certain rules (eg, transparency). There are no equivalent bona fide "patients" that are employed at pharma companies although their agents may employ them. It's not clear if these agents get permission from PLM to allow their "patients" access with tools for monitoring the discussion.
