208 hunch net-2006-09-18-What is missing for online collaborative research?


Meta information for this blog post

Source: html

Introduction: The internet has recently made the research process much smoother: papers are easy to obtain, citations are easy to follow, and unpublished “tutorials” are often available. Yet, new research fields can look very complicated to outsiders or newcomers. Every paper is like a small piece of an unfinished jigsaw puzzle: to understand just one publication, a researcher without experience in the field will typically have to follow several layers of citations, and many of the papers he encounters have a great deal of repeated information. Furthermore, from one publication to the next, notation and terminology may not be consistent, which can further confuse the reader. But the internet is now proving to be an extremely useful medium for collaboration and knowledge aggregation. Online forums allow users to ask and answer questions and to share ideas. The recent phenomenon of Wikipedia provides a proof-of-concept for the “anyone can edit” system. Can such models be used to facilitate research a


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 The internet has recently made the research process much smoother: papers are easy to obtain, citations are easy to follow, and unpublished “tutorials” are often available. [sent-1, score-0.307]

2 Every paper is like a small piece of an unfinished jigsaw puzzle: to understand just one publication, a researcher without experience in the field will typically have to follow several layers of citations, and many of the papers he encounters have a great deal of repeated information. [sent-3, score-0.243]

3 But the internet is now proving to be an extremely useful medium for collaboration and knowledge aggregation. [sent-5, score-0.295]

4 Can such models be used to facilitate research and collaboration? [sent-8, score-0.26]

5 This could potentially be extremely useful for newcomers and experts alike. [sent-9, score-0.163]

6 On the other hand, entities of this sort already exist to some extent: Wikipedia::Machine Learning, MLpedia, the discussion boards on kernel-machines. [sent-10, score-0.424]

7 None of these have yet achieved takeoff velocity. [sent-12, score-0.347]

8 You’ll know that takeoff velocity has been achieved when these become a necessary part of daily life rather than a frill. [sent-13, score-0.439]

9 Each of these efforts seems to be missing critical pieces, such as: A framework for organizing and summarizing information. [sent-14, score-0.226]

10 Wikipedia and MLpedia are good examples, yet this is not as well solved as you might hope, as mathematics on the web is still more awkward than it should be. [sent-15, score-0.157]

11 There does exist a discussion framework on Wikipedia/MLpedia, but the presentation format marginalizes discussion, placed on a separate page and generally not viewed by most observers. [sent-19, score-0.427]

12 Wikipedia intentionally anonymizes contributors in the presentation, because recognizing them might invite the wrong sort of contributor. [sent-22, score-0.23]

13 Given this constraint, it would be very handy if a system could automatically translate a subset of an online site into a paper, with authorship automatically summarized. [sent-25, score-0.516]

14 The site itself might also track and display who has contributed how much and who has contributed recently. [sent-26, score-0.487]

15 If you get 3 good researchers on a topic in a room, you might have about 5 distinct opinions. [sent-28, score-0.156]

16 Given that disagreement is a part of the process of research, there needs to be a way to facilitate, and even spotlight, disagreements for a healthy online research mechanism. [sent-30, score-0.444]

17 One crude system for handling disagreements is illustrated by the Linux kernel: “anyone can download and start their own kernel tree”. [sent-31, score-0.535]

18 A more fine-grained version of this may be effective: “anyone can clone a webpage and start their own version of it”. [sent-32, score-0.275]

19 Many systems handle this well, but it must be emphasized because small changes in the barrier to entry can have a large effect on (6). [sent-38, score-0.21]

20 Can a site be created that simultaneously handles all of the necessary pieces for online research? [sent-42, score-0.466]


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('wikipedia', 0.368), ('mlpedia', 0.172), ('takeoff', 0.172), ('framework', 0.149), ('handles', 0.142), ('facilitate', 0.142), ('site', 0.136), ('handling', 0.134), ('contributed', 0.134), ('disagreements', 0.134), ('incentives', 0.128), ('collaboration', 0.128), ('publication', 0.123), ('discussion', 0.122), ('entry', 0.118), ('research', 0.118), ('citations', 0.108), ('achieved', 0.101), ('system', 0.099), ('version', 0.099), ('online', 0.097), ('follow', 0.095), ('anyone', 0.095), ('part', 0.095), ('handle', 0.092), ('automatically', 0.092), ('none', 0.092), ('pieces', 0.091), ('extremely', 0.086), ('kernel', 0.084), ('presentation', 0.084), ('might', 0.083), ('internet', 0.081), ('unfinished', 0.077), ('boards', 0.077), ('clone', 0.077), ('entities', 0.077), ('gradual', 0.077), ('newcomers', 0.077), ('summarizing', 0.077), ('sort', 0.076), ('yet', 0.074), ('researchers', 0.073), ('exist', 0.072), ('forums', 0.071), ('rexa', 0.071), ('layers', 0.071), ('daily', 0.071), ('intentionally', 0.071), ('obtain', 0.071)]
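
The page does not say how the word weights above were computed. As a minimal sketch only, the following assumes whitespace tokenisation, a tf * log(N / df) weighting and per-document L2 normalisation; the function name `tfidf_top_words` and the toy documents are hypothetical and are not taken from the site.

```python
import math
from collections import Counter


def tfidf_top_words(docs, doc_index, top_n=5):
    """Return the top_n (word, weight) pairs for docs[doc_index].

    Assumed weighting (not confirmed by the page): tf * log(N / df),
    followed by L2 normalisation of the document's weight vector.
    """
    tokenised = [doc.lower().split() for doc in docs]
    n_docs = len(tokenised)
    # Document frequency: the number of documents each word occurs in.
    df = Counter(word for tokens in tokenised for word in set(tokens))
    # Term frequency inside the document of interest.
    tf = Counter(tokenised[doc_index])
    weights = {w: count * math.log(n_docs / df[w]) for w, count in tf.items()}
    # Guard against an all-zero vector (every word occurs in every document).
    norm = math.sqrt(sum(v * v for v in weights.values())) or 1.0
    # Sort by descending weight, breaking ties alphabetically.
    ranked = sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(w, round(v / norm, 3)) for w, v in ranked[:top_n]]


# Hypothetical toy corpus, for illustration only.
docs = [
    "wikipedia research collaboration wikipedia",
    "research papers citations",
    "research conferences reviewing",
]
print(tfidf_top_words(docs, 0, top_n=2))
```

Under these assumptions a word that occurs in every document (here "research") gets weight zero, which is one reason distinctive words such as "wikipedia" would head a list like the one above.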

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0000005 208 hunch net-2006-09-18-What is missing for online collaborative research?

2 0.30018416 134 hunch net-2005-12-01-The Webscience Future

Introduction: The internet has significantly affected the way we do research but its capabilities have not yet been fully realized. First, let’s acknowledge some known effects. Self-publishing By default, all researchers in machine learning (and more generally computer science and physics) place their papers online for anyone to download. The exact mechanism differs: physicists tend to use a central repository (Arxiv) while computer scientists tend to place the papers on their webpage. Arxiv has been slowly growing in subject breadth so it is now sometimes used by computer scientists. Collaboration Email has enabled working remotely with coauthors. This has allowed collaborations which would not otherwise have been possible and generally speeds research. Now, let’s look at attempts to go further. Blogs (like this one) allow public discussion about topics which are not easily categorized as “a new idea in machine learning” (like this topic). Organization of some subfield

3 0.16308658 132 hunch net-2005-11-26-The Design of an Optimal Research Environment

Introduction: How do you create an optimal environment for research? Here are some essential ingredients that I see. Stability. University-based research is relatively good at this. On any particular day, researchers face choices in what they will work on. A very common tradeoff is between “easy small” and “difficult big”. For researchers without stability, the ‘easy small’ option wins. This is often “ok”: a series of incremental improvements on the state of the art can add up to something very beneficial. However, it misses one of the big potentials of research: finding entirely new and better ways of doing things. Stability comes in many forms. The prototypical example is tenure at a university: a tenured professor is almost impossible to fire, which means that the professor has the freedom to consider far horizon activities. An iron-clad guarantee of a paycheck is not necessary: industrial research labs have succeeded well with research positions of indefinite duration. Atnt rese

4 0.11037668 51 hunch net-2005-04-01-The Producer-Consumer Model of Research

Introduction: In the quest to understand what good reviewing is, perhaps it’s worthwhile to think about what good research is. One way to think about good research is in terms of a producer/consumer model. In the producer/consumer model of research, for any element of research there are producers (authors and coauthors of papers, for example) and consumers (people who use the papers to make new papers or code solving problems). A produced bit of research is judged as “good” if it is used by many consumers. There are two basic questions which immediately arise: Is this a good model of research? Are there alternatives? The producer/consumer model has some difficulties which can be (partially) addressed. Disconnect. A group of people doing research on some subject may become disconnected from the rest of the world. Each person uses the research of other people in the group so it appears good research is being done, but the group has no impact on the rest of the world. One way

5 0.10730885 30 hunch net-2005-02-25-Why Papers?

Introduction: Makc asked a good question in comments: “Why bother to make a paper, at all?” There are several reasons for writing papers which may not be immediately obvious to people not in academia. The basic idea is that papers have considerably more utility than the obvious “present an idea”. Papers are formalized units of work. Academics (especially young ones) are often judged on the number of papers they produce. Papers have a formalized method of citing and crediting others: the bibliography. Academics (especially older ones) are often judged on the number of citations they receive. Papers enable a “more fair” anonymous review. Conferences receive many papers, from which a subset are selected. Discussion forums are inherently not anonymous for anyone who wants to build a reputation for good work. Papers are an excuse to meet your friends. Papers are the content of conferences, but much of what you do is talk to friends about interesting problems while there. Sometimes yo

6 0.10319939 288 hunch net-2008-02-10-Complexity Illness

7 0.10304727 297 hunch net-2008-04-22-Taking the next step

8 0.099683747 343 hunch net-2009-02-18-Decision by Vetocracy

9 0.099276185 233 hunch net-2007-02-16-The Forgetting

10 0.098911673 378 hunch net-2009-11-15-The Other Online Learning

11 0.097673409 235 hunch net-2007-03-03-All Models of Learning have Flaws

12 0.097332269 207 hunch net-2006-09-12-Incentive Compatible Reviewing

13 0.096086845 485 hunch net-2013-06-29-The Benefits of Double-Blind Review

14 0.093855903 307 hunch net-2008-07-04-More Presentation Preparation

15 0.093212001 454 hunch net-2012-01-30-ICML Posters and Scope

16 0.091953427 344 hunch net-2009-02-22-Effective Research Funding

17 0.091545507 437 hunch net-2011-07-10-ICML 2011 and the future

18 0.091514409 98 hunch net-2005-07-27-Not goal metrics

19 0.090776622 363 hunch net-2009-07-09-The Machine Learning Forum

20 0.08921276 484 hunch net-2013-06-16-Representative Reviewing
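
The same-blog value of about 1.0 in the list above is what cosine similarity of a vector with itself would give, so one plausible reading (an assumption, since the page does not name its similarity measure) is that posts are ranked by cosine similarity between their sparse tfidf vectors. The sketch below illustrates that idea; `sparse_cosine`, `rank_similar`, and the vectors for "post-b" and "post-c" are hypothetical.

```python
import math


def sparse_cosine(a, b):
    """Cosine similarity of two sparse vectors given as (key, weight) lists."""
    da, db = dict(a), dict(b)
    # Only keys present in both vectors contribute to the dot product.
    dot = sum(w * db[k] for k, w in da.items() if k in db)
    na = math.sqrt(sum(w * w for w in da.values()))
    nb = math.sqrt(sum(w * w for w in db.values()))
    if na == 0.0 or nb == 0.0:
        return 0.0
    return dot / (na * nb)


def rank_similar(query, corpus):
    """Return (post_id, similarity) pairs, most similar first."""
    scored = [(post_id, sparse_cosine(query, vec)) for post_id, vec in corpus.items()]
    return sorted(scored, key=lambda pair: -pair[1])


# The query reuses three weights from the word list above; the corpus
# entries "post-b" and "post-c" are invented for illustration.
query = [("wikipedia", 0.368), ("mlpedia", 0.172), ("takeoff", 0.172)]
corpus = {
    "same-post": query,
    "post-b": [("wikipedia", 0.1), ("arxiv", 0.4)],
    "post-c": [("tenure", 0.5)],
}
for post_id, sim in rank_similar(query, corpus):
    print(post_id, sim)
```

Floating-point rounding can leave the self-similarity a hair above or below 1, which would be consistent with a reported value such as 1.0000005. The lsi and lda lists further down report same-blog values below 1, so those models evidently involve some additional approximation that this sketch does not attempt to reproduce.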


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.239), (1, -0.055), (2, -0.03), (3, 0.123), (4, -0.028), (5, 0.027), (6, -0.017), (7, 0.01), (8, -0.029), (9, 0.071), (10, 0.014), (11, -0.024), (12, -0.033), (13, 0.005), (14, 0.058), (15, -0.006), (16, -0.073), (17, 0.037), (18, 0.044), (19, 0.021), (20, -0.034), (21, -0.058), (22, -0.07), (23, 0.015), (24, 0.069), (25, 0.0), (26, 0.01), (27, 0.077), (28, -0.089), (29, -0.058), (30, 0.061), (31, 0.035), (32, 0.095), (33, -0.079), (34, -0.006), (35, 0.005), (36, 0.032), (37, -0.02), (38, 0.031), (39, 0.003), (40, -0.037), (41, 0.061), (42, -0.089), (43, -0.034), (44, -0.075), (45, -0.054), (46, 0.029), (47, 0.087), (48, 0.127), (49, -0.031)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.96040189 208 hunch net-2006-09-18-What is missing for online collaborative research?

2 0.85834628 134 hunch net-2005-12-01-The Webscience Future

Introduction: The internet has significantly affected the way we do research but its capabilities have not yet been fully realized. First, let’s acknowledge some known effects. Self-publishing By default, all researchers in machine learning (and more generally computer science and physics) place their papers online for anyone to download. The exact mechanism differs: physicists tend to use a central repository (Arxiv) while computer scientists tend to place the papers on their webpage. Arxiv has been slowly growing in subject breadth so it is now sometimes used by computer scientists. Collaboration Email has enabled working remotely with coauthors. This has allowed collaborations which would not otherwise have been possible and generally speeds research. Now, let’s look at attempts to go further. Blogs (like this one) allow public discussion about topics which are not easily categorized as “a new idea in machine learning” (like this topic). Organization of some subfield

3 0.68785882 1 hunch net-2005-01-19-Why I decided to run a weblog.

Introduction: I have decided to run a weblog on machine learning and learning theory research. Here are some reasons: 1) Weblogs enable new functionality: Public comment on papers. No mechanism for this exists at conferences and most journals. I have encountered it once for a science paper. Some communities have mailing lists supporting this, but not machine learning or learning theory. I have often read papers and found myself wishing there was some method to consider others’ questions and read the replies. Conference shortlists. One of the most common conversations at a conference is “what did you find interesting?” There is no explicit mechanism for sharing this information at conferences, and it’s easy to imagine that it would be handy to do so. Evaluation and comment on research directions. Papers are almost exclusively about new research, rather than evaluation (and consideration) of research directions. This last role is satisfied by funding agencies to some extent, but

4 0.63234252 30 hunch net-2005-02-25-Why Papers?

Introduction: Makc asked a good question in comments: “Why bother to make a paper, at all?” There are several reasons for writing papers which may not be immediately obvious to people not in academia. The basic idea is that papers have considerably more utility than the obvious “present an idea”. Papers are formalized units of work. Academics (especially young ones) are often judged on the number of papers they produce. Papers have a formalized method of citing and crediting others: the bibliography. Academics (especially older ones) are often judged on the number of citations they receive. Papers enable a “more fair” anonymous review. Conferences receive many papers, from which a subset are selected. Discussion forums are inherently not anonymous for anyone who wants to build a reputation for good work. Papers are an excuse to meet your friends. Papers are the content of conferences, but much of what you do is talk to friends about interesting problems while there. Sometimes yo

5 0.59578878 297 hunch net-2008-04-22-Taking the next step

Introduction: At the last ICML, Tom Dietterich asked me to look into systems for commenting on papers. I’ve been slow getting to this, but it’s relevant now. The essential observation is that we now have many tools for online collaboration, but they are not yet much used in academic research. If we can find the right way to use them, then perhaps great things might happen, with extra kudos to the first conference that manages to really create an online community. Various conferences have been poking at this. For example, UAI has set up a wiki, COLT has started using Joomla, with some dynamic content, and AAAI has been setting up a “student blog”. Similarly, Dinoj Surendran set up a twiki for the Chicago Machine Learning Summer School, which was quite useful for coordinating events and other things. I believe the most important thing is a willingness to experiment. A good place to start seems to be enhancing existing conference websites. For example, the ICML 2007 papers pag

6 0.58939737 51 hunch net-2005-04-01-The Producer-Consumer Model of Research

7 0.58666897 288 hunch net-2008-02-10-Complexity Illness

8 0.57791561 98 hunch net-2005-07-27-Not goal metrics

9 0.57620078 132 hunch net-2005-11-26-The Design of an Optimal Research Environment

10 0.57117635 231 hunch net-2007-02-10-Best Practices for Collaboration

11 0.5648638 233 hunch net-2007-02-16-The Forgetting

12 0.55146509 172 hunch net-2006-04-14-JMLR is a success

13 0.55044538 323 hunch net-2008-11-04-Rise of the Machines

14 0.54437113 295 hunch net-2008-04-12-It Doesn’t Stop

15 0.53784233 363 hunch net-2009-07-09-The Machine Learning Forum

16 0.53749698 399 hunch net-2010-05-20-Google Predict

17 0.53054017 76 hunch net-2005-05-29-Bad ideas

18 0.52127045 333 hunch net-2008-12-27-Adversarial Academia

19 0.51796281 344 hunch net-2009-02-22-Effective Research Funding

20 0.51700979 106 hunch net-2005-09-04-Science in the Government


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(3, 0.017), (10, 0.02), (20, 0.294), (27, 0.167), (38, 0.044), (53, 0.082), (55, 0.117), (56, 0.015), (62, 0.02), (94, 0.072), (95, 0.073)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.86461771 208 hunch net-2006-09-18-What is missing for online collaborative research?

2 0.84043628 7 hunch net-2005-01-31-Watchword: Assumption

Introduction: “Assumption” is another word to be careful with in machine learning because it is used in several ways. Assumption = Bias There are several ways to see that some form of ‘bias’ (= preferring of one solution over another) is necessary. This is obvious in an adversarial setting. A good bit of work has been expended explaining this in other settings with “no free lunch” theorems. This is a usage specialized to learning which is particularly common when talking about priors for Bayesian Learning. Assumption = “if” of a theorem The assumptions are the ‘if’ part of the ‘if-then’ in a theorem. This is a fairly common usage. Assumption = Axiom The assumptions are the things that we assume are true, but which we cannot verify. Examples are “the IID assumption” or “my problem is a DNF on a small number of bits”. This is the usage which I prefer. One difficulty with any use of the word “assumption” is that you often encounter “if assumption then conclusion so if no

3 0.7520417 464 hunch net-2012-05-03-Microsoft Research, New York City

Introduction: Yahoo! laid off people. Unlike every previous time there have been layoffs, this is serious for Yahoo! Research. We had advance warning from Prabhakar through the simple act of leaving. Yahoo! Research was a world class organization that Prabhakar recruited much of personally, so it is deeply implausible that he would spontaneously decide to leave. My first thought when I saw the news was “Uhoh, Rob said that he knew it was serious when the head of ATnT Research left.” In this case it was even more significant, because Prabhakar recruited me on the premise that Y!R was an experiment in how research should be done: via a combination of high quality people and high engagement with the company. Prabhakar’s departure is a clear end to that experiment. The result is ambiguous from a business perspective. Y!R clearly was not capable of saving the company from its illnesses. I’m not privy to the internal accounting of impact and this is the kind of subject where there c

4 0.74336958 190 hunch net-2006-07-06-Branch Prediction Competition

Introduction: Alan Fern points out the second branch prediction challenge (due September 29) which is a follow up to the first branch prediction competition. Branch prediction is one of the fundamental learning problems of the computer age: without it our computers might run an order of magnitude slower. This is a tough problem since there are sharp constraints on time and space complexity in an online environment. For machine learning, the “idealistic track” may fit well. Essentially, they remove these constraints to gain a weak upper bound on what might be done.

5 0.70594245 116 hunch net-2005-09-30-Research in conferences

Introduction: Conferences exist as part of the process of doing research. They provide many roles including “announcing research”, “meeting people”, and “point of reference”. Not all conferences are alike so a basic question is: “to what extent do individual conferences attempt to aid research?” This question is very difficult to answer in any satisfying way. What we can do is compare details of the process across multiple conferences. Comments The average quality of comments across conferences can vary dramatically. At one extreme, the tradition in CS theory conferences is to provide essentially zero feedback. At the other extreme, some conferences have a strong tradition of providing detailed constructive feedback. Detailed feedback can give authors significant guidance about how to improve research. This is the most subjective entry. Blind Virtually all conferences offer single blind review where authors do not know reviewers. Some also provide double blind review where rev

6 0.6755504 351 hunch net-2009-05-02-Wielding a New Abstraction

7 0.62232351 437 hunch net-2011-07-10-ICML 2011 and the future

8 0.61514086 466 hunch net-2012-06-05-ICML acceptance statistics

9 0.61040133 204 hunch net-2006-08-28-Learning Theory standards for NIPS 2006

10 0.61027402 141 hunch net-2005-12-17-Workshops as Franchise Conferences

11 0.6100589 132 hunch net-2005-11-26-The Design of an Optimal Research Environment

12 0.6097874 343 hunch net-2009-02-18-Decision by Vetocracy

13 0.60941744 454 hunch net-2012-01-30-ICML Posters and Scope

14 0.60873431 225 hunch net-2007-01-02-Retrospective

15 0.60855007 423 hunch net-2011-02-02-User preferences for search engines

16 0.606619 40 hunch net-2005-03-13-Avoiding Bad Reviewing

17 0.60647327 207 hunch net-2006-09-12-Incentive Compatible Reviewing

18 0.60617715 134 hunch net-2005-12-01-The Webscience Future

19 0.60575199 297 hunch net-2008-04-22-Taking the next step

20 0.60479176 403 hunch net-2010-07-18-ICML & COLT 2010