hunch_net hunch_net-2006 hunch_net-2006-173 knowledge-graph by maker-knowledge-mining

173 hunch net-2006-04-17-Rexa is live


meta info for this blog

Source: html

Introduction: Rexa is now publicly available. Anyone can create an account and login. Rexa is similar to Citeseer and Google Scholar in functionality with more emphasis on the use of machine learning for intelligent information extraction. For example, Rexa can automatically display a picture on an author’s homepage when the author is searched for.


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 Rexa is similar to Citeseer and Google Scholar in functionality with more emphasis on the use of machine learning for intelligent information extraction. [sent-3, score-0.834]

2 For example, Rexa can automatically display a picture on an author’s homepage when the author is searched for. [sent-4, score-0.801]
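
For orientation, here is a minimal sketch of how such sentence scores could be produced, assuming a plain scikit-learn tf-idf scoring with no tuning (not necessarily the pipeline that generated the numbers above): each sentence of the post is scored by the sum of its tf-idf weights, and the top-scoring sentences form the extractive summary.

```python
# Hedged sketch: rank sentences by summed tf-idf weight (assumed setup, not the original pipeline).
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

sentences = [
    "Rexa is now publicly available.",
    "Anyone can create an account and login.",
    "Rexa is similar to Citeseer and Google Scholar in functionality with more "
    "emphasis on the use of machine learning for intelligent information extraction.",
    "For example, Rexa can automatically display a picture on an author's homepage "
    "when the author is searched for.",
]

tfidf = TfidfVectorizer(stop_words="english").fit_transform(sentences)
scores = np.asarray(tfidf.sum(axis=1)).ravel()           # one score per sentence
ranking = sorted(enumerate(scores), key=lambda p: -p[1])  # (sentIndex, sentScore) pairs
print(ranking[:2])                                        # top-2 sentences, as in the summary above
```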


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('rexa', 0.71), ('citeseer', 0.237), ('scholar', 0.237), ('author', 0.229), ('picture', 0.213), ('functionality', 0.205), ('display', 0.205), ('publicly', 0.186), ('emphasis', 0.181), ('intelligent', 0.172), ('automatically', 0.154), ('google', 0.144), ('account', 0.137), ('anyone', 0.105), ('create', 0.099), ('similar', 0.095), ('information', 0.075), ('use', 0.054), ('example', 0.049), ('machine', 0.037), ('learning', 0.015)]
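
For context, the (wordName, wordTfidf) pairs above are a tf-idf representation of this post. The sketch below is a hedged illustration, assuming scikit-learn and a toy two-post corpus rather than the real hunch.net archive, of how such per-word weights and the post-to-post cosine similarities listed next could be computed.

```python
# Hedged sketch: tf-idf word weights and pairwise cosine similarity (toy corpus, assumed setup).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

posts = [
    "Rexa is now publicly available. Anyone can create an account and login. "
    "Rexa is similar to Citeseer and Google Scholar in functionality with more "
    "emphasis on the use of machine learning for intelligent information extraction.",
    "Let me add to John's post with a few of my own favourites from this year's "
    "NIPS conference, including the Citeseer replacement system Rexa.",
]

vectorizer = TfidfVectorizer(stop_words="english")
tfidf = vectorizer.fit_transform(posts)                  # rows = posts, columns = words

# Top-weighted words for the first post, analogous to the (wordName, wordTfidf) list above.
terms = vectorizer.get_feature_names_out()
weights = tfidf[0].toarray().ravel()
print(sorted(zip(terms, weights), key=lambda t: -t[1])[:10])

# Pairwise cosine similarities, analogous to the simValue column in the list below.
print(cosine_similarity(tfidf))
```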

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.99999994 173 hunch net-2006-04-17-Rexa is live

Introduction: Rexa is now publicly available. Anyone can create an account and login. Rexa is similar to Citeseer and Google Scholar in functionality with more emphasis on the use of machine learning for intelligent information extraction. For example, Rexa can automatically display a picture on an author’s homepage when the author is searched for.

2 0.096778527 139 hunch net-2005-12-11-More NIPS Papers

Introduction: Let me add to John’s post with a few of my own favourites from this year’s conference. First, let me say that Sanjoy’s talk, Coarse Sample Complexity Bounds for Active Learning was also one of my favourites, as was the Forgettron paper. I also really enjoyed the last third of Christos’ talk on the complexity of finding Nash equilibria. And, speaking of tagging, I think the U.Mass Citeseer replacement system Rexa from the demo track is very cool. Finally, let me add my recommendations for specific papers: Z. Ghahramani, K. Heller: Bayesian Sets [no preprint] (A very elegant probabilistic information retrieval style model of which objects are “most like” a given subset of objects.) T. Griffiths, Z. Ghahramani: Infinite Latent Feature Models and the Indian Buffet Process [preprint] (A Dirichlet style prior over infinite binary matrices with beautiful exchangeability properties.) K. Weinberger, J. Blitzer, L. Saul: Distance Metric Lea

3 0.08750394 208 hunch net-2006-09-18-What is missing for online collaborative research?

Introduction: The internet has recently made the research process much smoother: papers are easy to obtain, citations are easy to follow, and unpublished “tutorials” are often available. Yet, new research fields can look very complicated to outsiders or newcomers. Every paper is like a small piece of an unfinished jigsaw puzzle: to understand just one publication, a researcher without experience in the field will typically have to follow several layers of citations, and many of the papers he encounters have a great deal of repeated information. Furthermore, from one publication to the next, notation and terminology may not be consistent, which can further confuse the reader. But the internet is now proving to be an extremely useful medium for collaboration and knowledge aggregation. Online forums allow users to ask and answer questions and to share ideas. The recent phenomenon of Wikipedia provides a proof-of-concept for the “anyone can edit” system. Can such models be used to facilitate research a

4 0.075093925 65 hunch net-2005-05-02-Reviewing techniques for conferences

Introduction: The many reviews following the many paper deadlines are just about over. AAAI and ICML in particular were experimenting with several reviewing techniques. Double Blind: AAAI and ICML were both double blind this year. It seemed (overall) beneficial, but two problems arose. For theoretical papers, with a lot to say, authors often leave out the proofs. This is very hard to cope with under a double blind review because (1) you can not trust the authors got the proof right but (2) a blanket “reject” hits many probably-good papers. Perhaps authors should more strongly favor proof-complete papers sent to double blind conferences. On the author side, double blind reviewing is actually somewhat disruptive to research. In particular, it discourages the author from talking about the subject, which is one of the mechanisms of research. This is not a great drawback, but it is one not previously appreciated. Author feedback: AAAI and ICML did author feedback this year. It seem

5 0.072353102 278 hunch net-2007-12-17-New Machine Learning mailing list

Introduction: IMLS (which is the nonprofit running ICML) has set up a new mailing list for Machine Learning News. The list address is ML-news@googlegroups.com, and signup requires a Google account (which you can create). Only members can send messages.

6 0.063251443 154 hunch net-2006-02-04-Research Budget Changes

7 0.058073178 468 hunch net-2012-06-29-ICML survey and comments

8 0.057870008 323 hunch net-2008-11-04-Rise of the Machines

9 0.052657358 461 hunch net-2012-04-09-ICML author feedback is open

10 0.052293368 168 hunch net-2006-04-02-Mad (Neuro)science

11 0.052016336 423 hunch net-2011-02-02-User preferences for search engines

12 0.04916282 267 hunch net-2007-10-17-Online as the new adjective

13 0.048350945 178 hunch net-2006-05-08-Big machine learning

14 0.048342414 452 hunch net-2012-01-04-Why ICML? and the summer conferences

15 0.045982964 116 hunch net-2005-09-30-Research in conferences

16 0.045695312 304 hunch net-2008-06-27-Reviewing Horror Stories

17 0.044597059 117 hunch net-2005-10-03-Not ICML

18 0.042772736 326 hunch net-2008-11-11-COLT CFP

19 0.04172571 331 hunch net-2008-12-12-Summer Conferences

20 0.039405987 148 hunch net-2006-01-13-Benchmarks for RL


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.056), (1, -0.041), (2, 0.014), (3, 0.016), (4, -0.001), (5, -0.012), (6, -0.021), (7, -0.001), (8, -0.011), (9, 0.001), (10, -0.012), (11, -0.023), (12, -0.003), (13, -0.012), (14, -0.009), (15, -0.009), (16, -0.046), (17, -0.035), (18, -0.002), (19, 0.009), (20, 0.032), (21, -0.003), (22, 0.008), (23, -0.001), (24, 0.025), (25, -0.027), (26, 0.066), (27, 0.09), (28, -0.053), (29, 0.018), (30, -0.056), (31, 0.028), (32, 0.096), (33, -0.037), (34, 0.026), (35, 0.082), (36, -0.015), (37, -0.024), (38, -0.03), (39, -0.023), (40, -0.047), (41, -0.024), (42, -0.069), (43, -0.025), (44, -0.003), (45, 0.038), (46, 0.005), (47, 0.058), (48, -0.013), (49, 0.002)]
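
The topic weights above are an LSI-style embedding of the post. A minimal sketch, assuming tf-idf followed by truncated SVD on a toy corpus (the lists above imply 50 components fit on the full hunch.net archive, which is not reproduced here):

```python
# Hedged sketch: LSI-style topic weights via tf-idf + truncated SVD (toy corpus, assumed setup).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.metrics.pairwise import cosine_similarity

corpus = [
    "Rexa is similar to Citeseer and Google Scholar with emphasis on machine learning.",
    "The internet has affected the way we do research and how papers are self-published.",
    "Yahoo is releasing a machine learning system for predicting relevant advertisements.",
    "A handy table of summer conference deadlines and reviewing policies.",
]

tfidf = TfidfVectorizer(stop_words="english").fit_transform(corpus)
lsi = TruncatedSVD(n_components=3, random_state=0)       # the lists above use 50 topics
topic_weights = lsi.fit_transform(tfidf)                 # one topic-weight vector per post

print(topic_weights[0])                                  # analogue of the topicWeight row above
print(cosine_similarity(topic_weights[[0]], topic_weights))  # analogue of the simValue column
```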

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.9551881 173 hunch net-2006-04-17-Rexa is live

Introduction: Rexa is now publicly available. Anyone can create an account and login. Rexa is similar to Citeseer and Google Scholar in functionality with more emphasis on the use of machine learning for intelligent information extraction. For example, Rexa can automatically display a picture on an author’s homepage when the author is searched for.

2 0.44668141 134 hunch net-2005-12-01-The Webscience Future

Introduction: The internet has significantly affected the way we do research, but its capabilities have not yet been fully realized. First, let’s acknowledge some known effects. Self-publishing: By default, all researchers in machine learning (and more generally computer science and physics) place their papers online for anyone to download. The exact mechanism differs—physicists tend to use a central repository (Arxiv) while computer scientists tend to place the papers on their webpage. Arxiv has been slowly growing in subject breadth so it is now sometimes used by computer scientists. Collaboration: Email has enabled working remotely with coauthors. This has allowed collaborations which would not otherwise have been possible and generally speeds research. Now, let’s look at attempts to go further. Blogs (like this one) allow public discussion about topics which are not easily categorized as “a new idea in machine learning” (like this topic). Organization of some subfield

3 0.43368557 178 hunch net-2006-05-08-Big machine learning

Introduction: According to the New York Times, Yahoo is releasing Project Panama shortly. Project Panama is about better predicting which advertisements are relevant to a search, implying a higher click-through rate, implying larger income for Yahoo. There are two things that seem interesting here: A significant portion of that improved accuracy is almost certainly machine learning at work. The quantitative effect is huge—the estimate in the article is $600*10^6. Google already has such improvements and Microsoft Search is surely working on them, which suggests this is (perhaps) a $10^9 per year machine learning problem. The exact methodology under use is unlikely to be publicly discussed in the near future because of the competitive environment. Hopefully we’ll have some public “war stories” at some point in the future when this information becomes less sensitive. For now, it’s reassuring to simply note that machine learning is having a big impact.

4 0.42662942 331 hunch net-2008-12-12-Summer Conferences

Introduction: Here’s a handy table for the summer conferences.

Conference | Deadline | Reviewer Targeting | Double Blind | Author Feedback | Location | Date
ICML (wrong ICML) | January 26 | Yes | Yes | Yes | Montreal, Canada | June 14-17
COLT | February 13 | No | No | Yes | Montreal | June 19-21
UAI | March 13 | No | Yes | No | Montreal | June 19-21
KDD | February 2/6 | No | No | No | Paris, France | June 28-July 1

Reviewer targeting is new this year. The idea is that many poor decisions happen because the papers go to reviewers who are unqualified, and the hope is that allowing authors to point out who is qualified results in better decisions. In my experience, this is a reasonable idea to test. Both UAI and COLT are experimenting this year as well with double blind and author feedback, respectively. Of the two, I believe author feedback is more important, as I’ve seen it make a difference. However, I still consider double blind reviewing a net wi

5 0.41498724 326 hunch net-2008-11-11-COLT CFP

Introduction: Adam Klivans points out the COLT call for papers. The important points are: Due Feb 13. Montreal, June 18-21. This year, there is author feedback.

6 0.41280812 278 hunch net-2007-12-17-New Machine Learning mailing list

7 0.41217548 323 hunch net-2008-11-04-Rise of the Machines

8 0.41156685 81 hunch net-2005-06-13-Wikis for Summer Schools and Workshops

9 0.40077761 354 hunch net-2009-05-17-Server Update

10 0.38106811 116 hunch net-2005-09-30-Research in conferences

11 0.37847382 65 hunch net-2005-05-02-Reviewing techniques for conferences

12 0.37534225 212 hunch net-2006-10-04-Health of Conferences Wiki

13 0.36006245 208 hunch net-2006-09-18-What is missing for online collaborative research?

14 0.35160342 10 hunch net-2005-02-02-Kolmogorov Complexity and Googling

15 0.35140479 342 hunch net-2009-02-16-KDNuggets

16 0.34830639 24 hunch net-2005-02-19-Machine learning reading groups

17 0.3425056 363 hunch net-2009-07-09-The Machine Learning Forum

18 0.33618805 193 hunch net-2006-07-09-The Stock Prediction Machine Learning Problem

19 0.33407357 122 hunch net-2005-10-13-Site tweak

20 0.33151293 117 hunch net-2005-10-03-Not ICML


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(27, 0.051), (55, 0.146), (62, 0.589)]
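
The sparse (topicId, topicWeight) pairs above are per-post LDA topic proportions. A hedged sketch, assuming scikit-learn's LatentDirichletAllocation on raw word counts over a toy corpus; the actual topic count and training data behind the numbers above are unknown:

```python
# Hedged sketch: per-post LDA topic proportions (toy corpus, assumed setup).
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

corpus = [
    "Rexa is similar to Citeseer and Google Scholar with emphasis on machine learning.",
    "Many machine learning conferences were in the US this summer and visas were a problem.",
    "COLT treasurer duties are being transferred and reviewing decisions were discussed.",
    "The design of a computing cluster for applied machine learning at TTI.",
]

counts = CountVectorizer(stop_words="english").fit_transform(corpus)
lda = LatentDirichletAllocation(n_components=3, random_state=0)
doc_topics = lda.fit_transform(counts)                   # rows sum to 1: per-post topic weights

# Sparse (topicId, topicWeight) view for the first post, like the list above.
print([(t, round(w, 3)) for t, w in enumerate(doc_topics[0]) if w > 0.05])
```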

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.91810131 173 hunch net-2006-04-17-Rexa is live

Introduction: Rexa is now publicly available. Anyone can create an account and login. Rexa is similar to Citeseer and Google Scholar in functionality with more emphasis on the use of machine learning for intelligent information extraction. For example, Rexa can automatically display a picture on an author’s homepage when the author is searched for.

2 0.70117819 195 hunch net-2006-07-12-Who is having visa problems reaching US conferences?

Introduction: Many of the large machine learning conferences were in the US this summer. A common problem which students from abroad encounter is visa issues. Just getting a visa to visit can be pretty rough: you stand around in lines, sometimes for days. Even worse is the timing with respect to ticket buying. Airplane tickets typically need to be bought well in advance on nonrefundable terms to secure a reasonable rate for air travel. When a visa is denied, as happens reasonably often, a very expensive ticket is burnt. A serious effort is under way to raise this as an issue in need of fixing. Over the long term, effectively driving research conferences to locate outside of the US seems an unwise policy. Robert Schapire is planning to talk to a congressman. Sally Goldman suggested putting together a list of problem cases, and Phil Long set up an email address immigration.and.confs@gmail.com to collect them. If you (or someone you know) has had insurmountable difficulties reaching

3 0.60971045 394 hunch net-2010-04-24-COLT Treasurer is now Phil Long

Introduction: For about 5 years, I’ve been the treasurer of the Association for Computational Learning, otherwise known as COLT, taking over from John Case before me. A transfer of duties to Phil Long is now about complete. This probably matters to almost no one, but I wanted to describe things a bit for those interested. The immediate impetus for this decision was unhappiness over reviewing decisions at COLT 2009, one as an author and several as a member of the program committee. I seem to have disagreements fairly often about what is important work, partly because I’m focused on learning theory with practical implications, partly because I define learning theory more broadly than is typical amongst COLT members, and partly because COLT suffers a bit from insider-clique issues. The degree to which these issues come up varies substantially each year so last year is not predictive of this one. And, it’s important to understand that COLT remains healthy with these issues not nearly so bad

4 0.55067849 128 hunch net-2005-11-05-The design of a computing cluster

Introduction: This is about the design of a computing cluster from the viewpoint of applied machine learning using current technology. We just built a small one at TTI so this is some evidence of what is feasible and thoughts about the design choices. Architecture There are several architectural choices. AMD Athlon64 based system. This seems to have the cheapest bang/buck. Maximum RAM is typically 2-3GB. AMD Opteron based system. Opterons provide the additional capability to buy an SMP motherboard with two chips, and the motherboards often support 16GB of RAM. The RAM is also the more expensive error correcting type. Intel PIV or Xeon based system. The PIV and Xeon based systems are the intel analog of the above 2. Due to architectural design reasons, these chips tend to run a bit hotter and be a bit more expensive. Dual core chips. Both Intel and AMD have chips that actually have 2 processors embedded in them. In the end, we decided to go with option (2). Roughly speaking,

5 0.48127806 263 hunch net-2007-09-18-It’s MDL Jim, but not as we know it…(on Bayes, MDL and consistency)

Introduction: I have recently completed a 500+ page book on MDL, the first comprehensive overview of the field (yes, this is a sneak advertisement). Chapter 17 compares MDL to a menagerie of other methods and paradigms for learning and statistics. By far the most time (20 pages) is spent on the relation between MDL and Bayes. My two main points here are: In sharp contrast to Bayes, MDL is by definition based on designing universal codes for the data relative to some given (parametric or nonparametric) probabilistic model M. By some theorems due to Andrew Barron, MDL inference must therefore be statistically consistent, and it is immune to Bayesian inconsistency results such as those by Diaconis, Freedman and Barron (I explain what I mean by “inconsistency” further below). Hence, MDL must be different from Bayes! In contrast to what has sometimes been claimed, practical MDL algorithms do have a subjective component (which in many, but not all cases, may be implemented by somethin

6 0.25412792 302 hunch net-2008-05-25-Inappropriate Mathematics for Machine Learning

7 0.25407073 20 hunch net-2005-02-15-ESPgame and image labeling

8 0.25405306 446 hunch net-2011-10-03-Monday announcements

9 0.25394514 271 hunch net-2007-11-05-CMU wins DARPA Urban Challenge

10 0.25317377 448 hunch net-2011-10-24-2011 ML symposium and the bears

11 0.25012597 90 hunch net-2005-07-07-The Limits of Learning Theory

12 0.24961615 472 hunch net-2012-08-27-NYAS ML 2012 and ICML 2013

13 0.24286455 270 hunch net-2007-11-02-The Machine Learning Award goes to …

14 0.24009053 326 hunch net-2008-11-11-COLT CFP

15 0.24009053 465 hunch net-2012-05-12-ICML accepted papers and early registration

16 0.23915415 331 hunch net-2008-12-12-Summer Conferences

17 0.23805961 395 hunch net-2010-04-26-Compassionate Reviewing

18 0.2344484 453 hunch net-2012-01-28-Why COLT?

19 0.22931075 387 hunch net-2010-01-19-Deadline Season, 2010

20 0.22525455 65 hunch net-2005-05-02-Reviewing techniques for conferences