Initialise all values of D1 to 1/N
Do for t = 1..T
    call weak-learn(Dt)
    calculate error et
    calculate βt = et / (1 - et)
    calculate Dt+1

classifier = argmax (c ∈ C) Σ log(1/βt),
             summing over all iterations t with result class c

Dt               class weight distribution on iteration t
N                number of classes
T                number of iterations
weak-learn(Dt)   weak learner with distribution Dt
et               weak-learn error on iteration t
βt               error adjustment value on iteration t
classifier       final boosted classifier
C                all classes

Fig. 5. AdaBoostM1 boosting algorithm

AdaBoostM1 has been shown to improve the performance of weak learner algorithms [Freund and Schapire 1996], particularly for the stronger learning algorithms like k-Nearest Neighbour. It is thus a sensible choice to boost our IBk classifier. Other types of classifier were considered, including the naïve Bayes classifier and the C4.5 decision tree, and informal tests were run to evaluate their performance. The boosted IBk classifier was found to give superior performance for this domain.
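To make Fig. 5 concrete, the following Python sketch implements AdaBoost.M1 around a k-nearest-neighbour weak learner. It is an illustration only: Quickstep's IBk learner is not shown here, and the sketch assumes scikit-learn's KNeighborsClassifier, resamples the training set from Dt (since k-NN accepts no per-example weights), and uses arbitrary parameter values.

import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def adaboost_m1(X, y, T=10, k=3, seed=0):
    rng = np.random.default_rng(seed)
    n = len(X)
    D = np.full(n, 1.0 / n)          # weight distribution, initialised to 1/N
    learners, betas = [], []
    for _ in range(T):
        # k-NN cannot take example weights directly, so resample from Dt
        idx = rng.choice(n, size=n, p=D)
        h = KNeighborsClassifier(n_neighbors=k).fit(X[idx], y[idx])
        wrong = h.predict(X) != y
        e = D[wrong].sum()           # weighted error et
        if e == 0 or e >= 0.5:       # AdaBoost.M1 requires et < 1/2
            break
        beta = e / (1.0 - e)         # error adjustment value Bt
        D[~wrong] *= beta            # shrink weights of correctly classified examples
        D /= D.sum()                 # renormalise to obtain Dt+1
        learners.append(h)
        betas.append(beta)
    return learners, betas

def boosted_predict(learners, betas, X, classes):
    # final classifier: argmax_c of the sum of log(1/Bt) over iterations voting c
    votes = np.zeros((len(X), len(classes)))
    for h, beta in zip(learners, betas):
        pred = h.predict(X)
        for ci, c in enumerate(classes):
            votes[pred == c, ci] += np.log(1.0 / beta)
    return np.asarray(classes)[votes.argmax(axis=1)]

Resampling in proportion to Dt is the standard workaround for base learners that cannot consume weighted examples directly; the final classifier weights each iteration's vote by log(1/βt), as in Fig. 5.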
2.2.4 Web page interface

Recommendations are presented to the user via a browser web page, shown in Figure 6. The web page applet loads the current recommendation set and records any feedback the user provides. Research papers can be jumped to, opening a new browser window to display the paper URL. If the user likes or dislikes a paper topic, the interest feedback combo-box allows “interested” or “not interested” to replace the default “no comment”.
Fig. 6. Quickstep’s web-based interface

Clicking on the topic and selecting a new one from a popup menu changes the topic of a paper, should the user feel the classification is incorrect. In the experiment described later, the ontology group has a hierarchical popup menu and the flat list group has a single-level popup menu. Figure 7 shows the hierarchical popup menu.

Fig. 7. Topic popup menus

New examples can be added via the interface, with users providing a paper URL and a topic label. These are added to the group’s training set, allowing users to teach the system new topics or improve the classification of old ones.

All feedback is stored in log files, ready for the profile builder’s run. The feedback logs are also used as the primary metric for evaluation: interest feedback, topic corrections and jumps to recommended papers are all recorded.
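As an illustration of what one such log record might carry, the sketch below derives its fields from the feedback types just listed; the field names and event labels are assumptions, not Quickstep's documented log format.

from dataclasses import dataclass
from datetime import datetime

@dataclass
class FeedbackRecord:
    user: str             # trial user the event belongs to
    paper_url: str        # the research paper concerned
    event: str            # "interested", "not_interested", "topic_corrected" or "jump"
    topic: str            # topic label, possibly corrected via the popup menu
    timestamp: datetime   # when the event occurred, for the daily profile build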
2.2.5 Profiler

Interest profiles are computed daily by correlating previously browsed research papers with their classification. User profiles thus hold a set of topics and interest values in these topics for each day of the trial. User feedback also adjusts the interest of topics within the profile, and a time-decay function weights recently seen papers as more important than older ones. Ontological relationships between topics of interest are used to infer other topics of interest, which might not have been browsed explicitly; an instance of an interest value for a specific class adds 50% of its value to the super-class. Figure 8 shows the profiling algorithm.

Topic interest = Σ (n = 1..no. of instances) Interest value(n) / days old(n)

Event interest values
    Paper browsed                = 1
    Recommendation followed      = 2
    Topic rated interesting      = 10
    Topic rated not interesting  = -10

Interest value for super-class per instance = 50% of sub-class

Fig. 8. Profiling algorithm
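The following Python sketch implements the Fig. 8 computation. The event values, the time-decay division by age in days, and the 50% super-class contribution come from the figure; the tuple shape of events, the child-to-parent topic map, and the choice to keep halving contributions at each further level up the hierarchy are illustrative assumptions.

EVENT_VALUES = {
    "paper_browsed": 1,
    "recommendation_followed": 2,
    "topic_rated_interesting": 10,
    "topic_rated_not_interesting": -10,
}

def compute_profile(events, super_class):
    """events: iterable of (topic, event_type, days_old) tuples;
    super_class: dict mapping each topic to its parent topic, or None at the root."""
    profile = {}
    for topic, event_type, days_old in events:
        # time decay: an event's value is divided by its age in days
        # (clamped to >= 1 here, an assumption, to avoid dividing by zero today)
        contribution = EVENT_VALUES[event_type] / max(days_old, 1)
        # each instance adds 50% of its value to the super-class; halving again
        # at every further level up is one reading of the rule, assumed here
        while topic is not None:
            profile[topic] = profile.get(topic, 0.0) + contribution
            topic = super_class.get(topic)
            contribution *= 0.5
    return profile

# e.g. a paper on "recommender systems" browsed two days ago plus an explicit rating:
parents = {"recommender systems": "information filtering", "information filtering": None}
events = [("recommender systems", "paper_browsed", 2),
          ("recommender systems", "topic_rated_interesting", 1)]
print(compute_profile(events, parents))
# {'recommender systems': 10.5, 'information filtering': 5.25}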
Profile feedback details a level of interest in a topic over a period of time. The user defines the exact level and duration of interests by drawing interest bars onto the time/interest graph via the profile interface. The profiling algorithm automatically adjusts the daily profiles to match any topic interest levels declared via profile feedback.

Event interest values were chosen to favour explicit feedback over implicit feedback, and the 50% value represents the reduction in confidence as an inference moves further from the direct observation. Other profiling algorithms exist, such as time-slicing and curve fitting, but the time-decay function appeared in informal tests to produce a good result; we found it to be a robust function for finding current interests.

2.2.6 Recommender

Recommendations are formulated from a correlation between the users’ current topics of interest and papers classified as belonging to those topics. A paper is only recommended if it does not appear in the user’s browsed URL log, ensuring that recommendations have not been seen before. For each user, the top three interesting topics are selected, with 10 recommendations made in total. Papers are ranked in order of recommendation confidence before being presented to the user.
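The selection step can be sketched as follows. The top-three-topic and ten-recommendation limits come from the text above, but this section does not give the exact recommendation-confidence formula, so multiplying each paper's classification confidence by the topic interest value is an illustrative assumption.

def recommend(profile, papers, browsed_urls, n_topics=3, n_recs=10):
    """profile: {topic: interest value}; papers: iterable of
    (url, topic, classification_confidence) tuples; browsed_urls: the
    user's browsed URL log as a set."""
    # take the user's top three interesting topics
    top_topics = sorted(profile, key=profile.get, reverse=True)[:n_topics]
    candidates = [
        (conf * profile[topic], url)    # assumed recommendation-confidence formula
        for url, topic, conf in papers
        if topic in top_topics and url not in browsed_urls   # unseen papers only
    ]
    candidates.sort(reverse=True)       # rank by recommendation confidence
    return [url for _, url in candidates[:n_recs]]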