hilary_mason_data hilary_mason_data-2011 hilary_mason_data-2011-59 knowledge-graph by maker-knowledge-mining

59 hilary mason data-2011-07-29-Uses This


meta info for this blog

Source: html

Introduction: Uses This Posted: July 29, 2011 | Author: Hilary Mason | Filed under: Media | Tags: tools, usesthis | 1 Comment » I’m honored to have my tools of choice featured on Uses This!


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 Uses This Posted: July 29, 2011 | Author: Hilary Mason | Filed under: Media | Tags: tools, usesthis | 1 Comment » I’m honored to have my tools of choice featured on Uses This! [sent-1, score-1.844]


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('uses', 0.516), ('tools', 0.516), ('featured', 0.427), ('honored', 0.385), ('july', 0.269), ('media', 0.223), ('comment', 0.109), ('mason', 0.05), ('tags', 0.02)]
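The weights in this listing square-sum to roughly 1, which is consistent with an L2-normalised tf-idf vector. The page does not say how the weights were produced, so the snippet below is only a quick check of that observation, not the pipeline's own code; the helper name `l2_norm` is made up for illustration.

```python
import math

# Weights copied from the listing above (top-N words only).
tfidf = [('uses', 0.516), ('tools', 0.516), ('featured', 0.427),
         ('honored', 0.385), ('july', 0.269), ('media', 0.223),
         ('comment', 0.109), ('mason', 0.05), ('tags', 0.02)]


def l2_norm(weights):
    """Euclidean length of a sparse (word, weight) vector."""
    return math.sqrt(sum(w * w for _, w in weights))


# Hypothetical check: a value very close to 1 suggests unit-length scaling.
norm = l2_norm(tfidf)
print(round(norm, 3))
```

If the vector is unit length, a cosine similarity against another unit-length vector reduces to a plain dot product.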

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.99999994 59 hilary mason data-2011-07-29-Uses This

2 0.093288973 31 hilary mason data-2009-08-12-My NYC Python Meetup Presentation: Practical Data Analysis in Python

Introduction: My NYC Python Meetup Presentation: Practical Data Analysis in Python Posted: August 12, 2009 | Author: hilary | Filed under: blog | Tags: data , data analysis , nltk , presentations , python , spam , twitter | Leave a comment » I gave a talk at the NYC Python Meetup on July 29 on Practical Data Analysis in Python . I tend to use my slides for visual representations of the concepts I’m discussing, so there’s a lot of content that was in the presentation that you unfortunately won’t see here. The talk starts with the immense opportunities for knowledge derived from data. I spent some time showing data systems ‘in the wild’ along with the appropriate algorithmic vocabulary (for example, amazon.com ‘s ‘books you might like’ feature is a recommender system ). Once we can describe the problems properly, we can look for tools, and Python has many! Finally, in the fun part of the presentation, I demoed working code that uses NLTK to build a Twitter sp

3 0.072190315 78 hilary mason data-2012-09-21-Help, I’m the first data scientist at my company!

Introduction: Help, I’m the first data scientist at my company! Posted: September 21, 2012 | Author: Hilary Mason | Filed under: blog | Tags: datagotham, datascience, panel, presentations | 2 Comments » I moderated a panel at DataGotham with Adam Laiacano from Tumblr, Fred Benenson from Kickstarter, and Roberto Medri from Etsy about being the first data scientist at a company. We covered everything from what people’s job responsibilities are, the tools they use, successes, failures, how they are integrated into an organization, and how they have hired other data scientists to join them. The panelists were concise, articulate, and intelligent. Watch it below!

4 0.06840416 46 hilary mason data-2010-08-15-Should you attend Hadoop World? Yes.

Introduction: Should you attend Hadoop World? Yes. Posted: August 15, 2010 | Author: hilary | Filed under: blog | Tags: conferences, hadoop, hadoopworld, questions | 2 Comments » I received this e-mail via my contact form: I just discovered you via a Google search because I’m highly considering attending this year’s upcoming Hadoop World in NYC. I appreciate your page that you wrote up after attending last year’s event. I’m wondering if you feel that Hadoop has enough momentum and support to be a “here to stay” technology worth investing one’s time and education into, or is it possible it might fade and be deprecated by something else as the need for big data analysis continues to grow? … I’ve had a few similar conversations with people lately, and I thought posting my response might help others making similar decisions. The e-mail is referencing my post from last year’s hadoop world NYC. Thanks for reaching out. There are several questions in your messa

5 0.065304875 110 hilary mason data-2013-10-06-What Mugshots Mean For Public Data

Introduction: What Mugshots Mean For Public Data Posted: October 6, 2013 | Author: Hilary Mason | Filed under: blog | Tags: data , mugshots , privacy | 20 Comments » The New York Times has a story this morning on the growing use of mugshot data for, essentially, extortion . These sites scrape mugshots off of public records databases, use SEO techniques to rank highly in Google searches for people’s names, and then charge those featured in the image to have the pages removed. Many of the people featured were never even convicted of a crime. What the mugshot story demonstrates but never says explicitly is that data is no longer just private or public, but often exists in an in-between state, where the public-ness of the data is a function of how much work is required to find it. Let’s say you’re actually doing a background check on someone you are going on a date with (one of the use cases the operators of these sites claim is common). Before online systems, you c

6 0.062464703 79 hilary mason data-2012-11-05-Where’s the API that can tell me that this photo contains a puppy and a can of Coke?

7 0.055053689 97 hilary mason data-2013-03-23-Why Google Now is Awesome

8 0.052679382 47 hilary mason data-2010-08-23-New York Times: Reinventing E-mail, One Message at a Time

9 0.051472627 20 hilary mason data-2008-07-21-Welcome

10 0.047209129 36 hilary mason data-2009-11-10-My code is on TV (and so am I)!

11 0.04328258 65 hilary mason data-2011-10-21-I’m on Fortune’s 40 Under 40: Ones to Watch list!

12 0.04322869 104 hilary mason data-2013-06-14-Speaking: Your Slides != Your Talk

13 0.04137627 76 hilary mason data-2012-08-28-How do you prioritize research?

14 0.038121682 49 hilary mason data-2010-11-10-Machine Learning: A Love Story

15 0.036746979 41 hilary mason data-2010-03-14-Art and Technology: Seven on Seven

16 0.032756858 63 hilary mason data-2011-09-26-Hacking the Food System: The Ultimate Chocolate Chip Cookie

17 0.031719159 85 hilary mason data-2013-01-19-Startups: How to Share Data with Academics

18 0.030751841 82 hilary mason data-2013-01-08-Bitly Social Data APIs

19 0.030420832 81 hilary mason data-2013-01-03-Interview Questions for Data Scientists

20 0.029461529 7 hilary mason data-2007-07-30-Tip: How to Search Google for Ideas
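The simValue column looks like a cosine similarity between tf-idf vectors, although the page never names the measure. Under that assumption, the sketch below compares this post's listed weights with a made-up vector for another post; `cosine` and the `other_post` weights are hypothetical and only illustrate the calculation.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse {word: weight} vectors."""
    dot = sum(w * b.get(word, 0.0) for word, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


# Weights for this post, copied from the tfidf listing on this page.
this_post = {'uses': 0.516, 'tools': 0.516, 'featured': 0.427,
             'honored': 0.385, 'july': 0.269, 'media': 0.223,
             'comment': 0.109, 'mason': 0.05, 'tags': 0.02}

# Hypothetical weights for some other post (not taken from the page).
other_post = {'tools': 0.1, 'python': 0.9}

self_sim = cosine(this_post, this_post)    # a post compared with itself
cross_sim = cosine(this_post, other_post)  # only 'tools' overlaps
```

A post compared with itself should score 1. The listed same-blog value, 0.99999994, matches the largest single-precision float below 1, which hints that the scores were computed in 32-bit floats; that is an inference from the number, not something the page states.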


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, -0.073), (1, -0.057), (2, -0.014), (3, -0.028), (4, -0.025), (5, -0.03), (6, 0.045), (7, -0.053), (8, -0.076), (9, 0.001), (10, -0.005), (11, -0.125), (12, -0.114), (13, 0.023), (14, 0.139), (15, -0.029), (16, 0.022), (17, -0.134), (18, 0.062), (19, -0.012), (20, -0.064), (21, -0.331), (22, 0.081), (23, -0.221), (24, 0.193), (25, 0.029), (26, 0.026), (27, -0.107), (28, -0.032), (29, -0.267), (30, 0.06), (31, -0.182), (32, 0.202), (33, 0.121), (34, 0.226), (35, -0.164), (36, -0.005), (37, -0.029), (38, 0.077), (39, 0.088), (40, -0.022), (41, 0.049), (42, 0.028), (43, -0.104), (44, 0.007), (45, 0.067), (46, -0.165), (47, -0.055), (48, 0.141), (49, -0.048)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.99692374 59 hilary mason data-2011-07-29-Uses This

2 0.31658849 31 hilary mason data-2009-08-12-My NYC Python Meetup Presentation: Practical Data Analysis in Python

Introduction: My NYC Python Meetup Presentation: Practical Data Analysis in Python Posted: August 12, 2009 | Author: hilary | Filed under: blog | Tags: data , data analysis , nltk , presentations , python , spam , twitter | Leave a comment » I gave a talk at the NYC Python Meetup on July 29 on Practical Data Analysis in Python . I tend to use my slides for visual representations of the concepts I’m discussing, so there’s a lot of content that was in the presentation that you unfortunately won’t see here. The talk starts with the immense opportunities for knowledge derived from data. I spent some time showing data systems ‘in the wild’ along with the appropriate algorithmic vocabulary (for example, amazon.com ‘s ‘books you might like’ feature is a recommender system ). Once we can describe the problems properly, we can look for tools, and Python has many! Finally, in the fun part of the presentation, I demoed working code that uses NLTK to build a Twitter sp

3 0.27639532 110 hilary mason data-2013-10-06-What Mugshots Mean For Public Data

Introduction: What Mugshots Mean For Public Data Posted: October 6, 2013 | Author: Hilary Mason | Filed under: blog | Tags: data , mugshots , privacy | 20 Comments » The New York Times has a story this morning on the growing use of mugshot data for, essentially, extortion . These sites scrape mugshots off of public records databases, use SEO techniques to rank highly in Google searches for people’s names, and then charge those featured in the image to have the pages removed. Many of the people featured were never even convicted of a crime. What the mugshot story demonstrates but never says explicitly is that data is no longer just private or public, but often exists in an in-between state, where the public-ness of the data is a function of how much work is required to find it. Let’s say you’re actually doing a background check on someone you are going on a date with (one of the use cases the operators of these sites claim is common). Before online systems, you c

4 0.27088764 79 hilary mason data-2012-11-05-Where’s the API that can tell me that this photo contains a puppy and a can of Coke?

Introduction: Where’s the API that can tell me that this photo contains a puppy and a can of Coke? Posted: November 5, 2012 | Author: Hilary Mason | Filed under: blog | Tags: api | 18 Comments » Photo by Ahmad van der Breggen on Flickr. We’ve gotten very good at extracting and disambiguating entities from text data. You can license a commodity system, and there are APIs and even open source tools that work fairly well. However, a large percentage of content that people share is not primarily text (a back-of-the-envelope guess says around 18%), and we currently have very little automated insight into that content. I know this is a very hard problem, but I’m continuously surprised by how few people seem to be working on it. Any ideas?

5 0.2435488 65 hilary mason data-2011-10-21-I’m on Fortune’s 40 Under 40: Ones to Watch list!

Introduction: I’m on Fortune’s 40 Under 40: Ones to Watch list! Posted: October 21, 2011 | Author: Hilary Mason | Filed under: Media | 1 Comment » I’m excited to be on Fortune’s 40 Under 40: Ones to Watch list! My world domination clock is ticking.

6 0.23650715 78 hilary mason data-2012-09-21-Help, I’m the first data scientist at my company!

7 0.19328932 36 hilary mason data-2009-11-10-My code is on TV (and so am I)!

8 0.18748482 63 hilary mason data-2011-09-26-Hacking the Food System: The Ultimate Chocolate Chip Cookie

9 0.17741765 47 hilary mason data-2010-08-23-New York Times: Reinventing E-mail, One Message at a Time

10 0.15832077 5 hilary mason data-2007-07-17-Where the Sun Rises… in Second Life

11 0.14949512 7 hilary mason data-2007-07-30-Tip: How to Search Google for Ideas

12 0.14852786 41 hilary mason data-2010-03-14-Art and Technology: Seven on Seven

13 0.14818703 104 hilary mason data-2013-06-14-Speaking: Your Slides != Your Talk

14 0.14308122 46 hilary mason data-2010-08-15-Should you attend Hadoop World? Yes.

15 0.14266467 29 hilary mason data-2009-05-07-I’m on Jon Udell’s Interviews with Innovators!

16 0.12858588 20 hilary mason data-2008-07-21-Welcome

17 0.12469102 45 hilary mason data-2010-07-26-A quick twitter bot, @bc l

18 0.1241705 53 hilary mason data-2011-03-11-Conference: PyCon 2011 Keynote!

19 0.12258022 82 hilary mason data-2013-01-08-Bitly Social Data APIs

20 0.11166076 76 hilary mason data-2012-08-28-How do you prioritize research?


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(46, 0.713)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.98116976 59 hilary mason data-2011-07-29-Uses This

2 0.050158661 46 hilary mason data-2010-08-15-Should you attend Hadoop World? Yes.

Introduction: Should you attend Hadoop World? Yes. Posted: August 15, 2010 | Author: hilary | Filed under: blog | Tags: conferences, hadoop, hadoopworld, questions | 2 Comments » I received this e-mail via my contact form: I just discovered you via a Google search because I’m highly considering attending this year’s upcoming Hadoop World in NYC. I appreciate your page that you wrote up after attending last year’s event. I’m wondering if you feel that Hadoop has enough momentum and support to be a “here to stay” technology worth investing one’s time and education into, or is it possible it might fade and be deprecated by something else as the need for big data analysis continues to grow? … I’ve had a few similar conversations with people lately, and I thought posting my response might help others making similar decisions. The e-mail is referencing my post from last year’s hadoop world NYC. Thanks for reaching out. There are several questions in your messa

3 0.040763643 31 hilary mason data-2009-08-12-My NYC Python Meetup Presentation: Practical Data Analysis in Python

Introduction: My NYC Python Meetup Presentation: Practical Data Analysis in Python Posted: August 12, 2009 | Author: hilary | Filed under: blog | Tags: data , data analysis , nltk , presentations , python , spam , twitter | Leave a comment » I gave a talk at the NYC Python Meetup on July 29 on Practical Data Analysis in Python . I tend to use my slides for visual representations of the concepts I’m discussing, so there’s a lot of content that was in the presentation that you unfortunately won’t see here. The talk starts with the immense opportunities for knowledge derived from data. I spent some time showing data systems ‘in the wild’ along with the appropriate algorithmic vocabulary (for example, amazon.com ‘s ‘books you might like’ feature is a recommender system ). Once we can describe the problems properly, we can look for tools, and Python has many! Finally, in the fun part of the presentation, I demoed working code that uses NLTK to build a Twitter sp

4 0.035283491 97 hilary mason data-2013-03-23-Why Google Now is Awesome

Introduction: Why Google Now is Awesome Posted: March 23, 2013 | Author: Hilary Mason | Filed under: blog | Tags: google | 11 Comments » Google Now is an extension to Google’s Android search app that uses all of the data that Google has about you along with what it can guess about your current context to present the information it thinks you need when it thinks you need it. It’ll tell you to leave a bit early to make your next calendar event because of heavy traffic, or that it’s a friend’s birthday, or that there’s a cool cafe nearby where you are. I think it’s amazing. It’s amazing because this is the first Google product that takes ALL OF THE DATA that they have about us and actually makes it useful for us . Not for advertisers. Finally.

5 0.033708245 79 hilary mason data-2012-11-05-Where’s the API that can tell me that this photo contains a puppy and a can of Coke?

Introduction: Where’s the API that can tell me that this photo contains a puppy and a can of Coke? Posted: November 5, 2012 | Author: Hilary Mason | Filed under: blog | Tags: api | 18 Comments » Photo by Ahmad van der Breggen on Flickr. We’ve gotten very good at extracting and disambiguating entities from text data. You can license a commodity system, and there are APIs and even open source tools that work fairly well. However, a large percentage of content that people share is not primarily text (a back-of-the-envelope guess says around 18%), and we currently have very little automated insight into that content. I know this is a very hard problem, but I’m continuously surprised by how few people seem to be working on it. Any ideas?

6 0.025028586 63 hilary mason data-2011-09-26-Hacking the Food System: The Ultimate Chocolate Chip Cookie

7 0.022211289 41 hilary mason data-2010-03-14-Art and Technology: Seven on Seven

8 0.019995103 104 hilary mason data-2013-06-14-Speaking: Your Slides != Your Talk

9 0.0 1 hilary mason data-2006-02-20-JavaScript Rotating Images Tutorial

10 0.0 2 hilary mason data-2006-05-04-Intro to the Linux Command Line

11 0.0 3 hilary mason data-2007-06-08-The Best Time to Search for Academic Jobs

12 0.0 4 hilary mason data-2007-06-11-Teaching Search Techniques with Google Games

13 0.0 5 hilary mason data-2007-07-17-Where the Sun Rises… in Second Life

14 0.0 6 hilary mason data-2007-07-27-Uninstall Programs … For Real.

15 0.0 7 hilary mason data-2007-07-30-Tip: How to Search Google for Ideas

16 0.0 8 hilary mason data-2007-08-19-Curriculum Design as Software Engineering

17 0.0 9 hilary mason data-2007-08-27-Second Life Community Convention

18 0.0 10 hilary mason data-2007-09-02-Autoscript Creates LSL Scripts Without Code

19 0.0 11 hilary mason data-2007-10-07-An Experience with Using a Wiki for a Collaborative Classroom Documentation Project

20 0.0 12 hilary mason data-2007-10-24-Teen Second Life College Fair