1
On Assigning Implicit Reputation Scores in an Online Labor Marketplace
Maria Daltayanni, Luca de Alfaro (UCSC), Panagiotis Papadimitriou (oDesk), Panayiotis Tsaparas (UoI) {mariadal,
2
Labor Marketplace
In online labor marketplaces, employers post job openings and receive applications from workers interested in them. Data about labor marketplaces is usually represented with a bipartite graph, as shown. The employer decides which applicant to hire and then works with the selected worker to accomplish the job requirements. At the end of the contract, the employer can provide the worker with a feedback rating that becomes visible in the worker's online profile and can guide the future hiring decisions of other employers. Usually, the possible outcomes of an application, in order of decreasing employer satisfaction, are: offer (hire) > interview > shortlist > ignore > hide > reject. The employer's feedback rating is an explicit form of reputation for workers: the employer selects a number of stars that reflects his belief about the worker's quality on the accomplished job.
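As a rough illustration, here is a minimal sketch of how such a bipartite application graph might be represented in code. All class, function, and node names are hypothetical; only the outcome list is taken from the slide.

```python
from collections import defaultdict

# Hiring outcomes in order of decreasing employer satisfaction (from the slide).
OUTCOMES = ["offer", "interview", "shortlist", "ignore", "hide", "reject"]

class ApplicationGraph:
    """Bipartite graph: employers on one side, workers on the other;
    each application is an edge labeled with its hiring outcome."""

    def __init__(self):
        self.by_employer = defaultdict(list)  # employer -> [(worker, outcome)]
        self.by_worker = defaultdict(list)    # worker -> [(employer, outcome)]

    def add_application(self, employer, worker, outcome):
        assert outcome in OUTCOMES
        self.by_employer[employer].append((worker, outcome))
        self.by_worker[worker].append((employer, outcome))

# Hypothetical usage:
g = ApplicationGraph()
g.add_application("acme_corp", "alice", "offer")
g.add_application("acme_corp", "bob", "shortlist")
```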
3
Which candidate to hire?
Use Past Employer Ratings (Explicit Reputation)
- scores usually skewed towards high ratings
- very sparse, since a worker needs to apply, get hired and complete a few jobs before obtaining a representative reputation score
However, as shown in the figure, most employers tend to give workers 5-star ratings, so explicit feedback does not help discriminate between good workers. At the same time, explicit star feedback is provided only for workers who eventually got hired and accomplished a job. We receive no information about the remaining candidates who participated in the application process and did not receive an offer in the final stage.
4
Which candidate to hire?
WorkerRank: Use Past Job Application Data (Implicit Reputation)
Which workers applied to which jobs? Did they get hired? Were they interviewed, shortlisted, ignored, hidden, rejected? User profiling.
To deal with these two basic problems, the authors devise an algorithm that examines the application data, that is, who applied to which job and what the hiring outcome was (did the applicant get hired, rejected, etc.). This application data provides implicit information about the quality of applicants. For example, the quality of a candidate who was rejected for a job is worse than that of a candidate who was shortlisted by the employer.
5
WorkerRank ‘PageRank’ for workers and employers
General score computation rule:
score(worker) += weight * score(employer), for all employers
score(employer) += weight * score(worker), for all workers
Weights include info about the hiring outcome. How are the weights specified?
The algorithm is based on HITS, a PageRank-style algorithm, run on the application data: the employers are the hubs and the workers are the authorities. The score computation rule uses weights on the edges to represent the quality information propagated from worker to employer and vice versa. For example, a worker who got an offer from Google would receive a higher score than a worker hired by a startup, since the startup has far fewer incoming edges from applicants. If anyone asks: the employers are the hubs, the workers are the authorities.
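The slide does not give the paper's exact update or normalization, so the following is only a sketch of a weighted HITS-style iteration consistent with the rule above. The L2 normalization and iteration count are assumptions, ApplicationGraph is the hypothetical structure sketched earlier, and weight() is any function mapping a hiring outcome to a number (e.g. the fixed weights on the next slide).

```python
def worker_rank(graph, weight, iterations=50):
    """graph: an ApplicationGraph as in the earlier sketch.
    weight: callable mapping a hiring outcome string to a float."""
    worker_score = {w: 1.0 for w in graph.by_worker}
    employer_score = {e: 1.0 for e in graph.by_employer}
    for _ in range(iterations):
        # score(worker) += weight * score(employer), for all employers
        for w, edges in graph.by_worker.items():
            worker_score[w] = sum(weight(o) * employer_score[e] for e, o in edges)
        # score(employer) += weight * score(worker), for all workers
        for e, edges in graph.by_employer.items():
            employer_score[e] = sum(weight(o) * worker_score[w] for w, o in edges)
        # L2-normalize each side to keep scores bounded (standard in HITS).
        for scores in (worker_score, employer_score):
            norm = sum(v * v for v in scores.values()) ** 0.5 or 1.0
            for k in scores:
                scores[k] /= norm
    return worker_score, employer_score
```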
6
Relativity
k: rank of a worker, if we sort the candidates by hiring outcome
An example job with 5 candidates:

k  hiring outcome
1  offer
2  shortlist
3  hide
4  hide
5  reject

We may allow fixed weights, e.g. offer = 3, interview = 2, shortlist = 1, ignore = -1, hide = -2, reject = -3, or compute relative weights for each job as shown in the figure. It is important to consider relativity, that is, the hiring outcomes of two candidates: essentially we want to measure "how much better" one candidate is than the other, e.g. offer versus shortlist, or offer versus reject. The fixed-weight mapping is sketched below.
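The fixed-weight scheme from this slide can be written directly as a lookup table, suitable as the weight() function in the WorkerRank sketch above:

```python
# Fixed weights per hiring outcome, exactly as listed on the slide.
FIXED_WEIGHTS = {
    "offer": 3, "interview": 2, "shortlist": 1,
    "ignore": -1, "hide": -2, "reject": -3,
}

def fixed_weight(outcome):
    return FIXED_WEIGHTS[outcome]
```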
7
Selectivity
n: total number of candidates
k: number of candidates who received an offer
Selectivity: (n - k) / n
In addition, it is important to consider selectivity (competitiveness): how many candidates applied to the job. It is more competitive to get an offer for a job with 100 candidates than for one with 10 candidates. A worked example follows.
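As a worked example of the selectivity formula (n - k) / n:

```python
def selectivity(n, k):
    """Selectivity of a job: n candidates in total, k of whom received an offer."""
    return (n - k) / n

# Getting an offer among 100 candidates is more competitive than among 10:
print(selectivity(100, 1))  # 0.99
print(selectivity(10, 1))   # 0.9
```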
8
Elo Ratings: Example

Initial Score  Hiring Outcome  Elo Score
100            rejected        96
100            ignored         98
100            ignored         98
100            hired           108

Next, Elo ratings are applied, which cover both relativity and selectivity in a principled way. Elo ratings perform pairwise comparisons among players (here, the applicants of a job) and assign implicit reputation scores: a better hiring outcome counts as a win over a worse one (e.g. hired versus rejected), while equal outcomes count as a draw (e.g. the two ignored candidates above).
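A hedged sketch of such pairwise Elo updates within one job's candidate pool follows. The K-factor and scale are standard Elo defaults chosen for illustration, not the paper's constants, so the output will not exactly reproduce the table above.

```python
from itertools import combinations

# Outcome ranks in order of decreasing employer satisfaction (lower is better).
RANK = {"offer": 0, "interview": 1, "shortlist": 2,
        "ignore": 3, "hide": 4, "reject": 5}

def elo_update(ratings, outcomes, k_factor=8, scale=400):
    """ratings: worker -> current Elo score; outcomes: worker -> hiring outcome.
    Compares every pair of candidates in the job and updates both scores."""
    for a, b in combinations(outcomes, 2):
        expected_a = 1 / (1 + 10 ** ((ratings[b] - ratings[a]) / scale))
        if RANK[outcomes[a]] < RANK[outcomes[b]]:
            score_a = 1.0   # a's outcome is better: a wins the pair
        elif RANK[outcomes[a]] > RANK[outcomes[b]]:
            score_a = 0.0   # a's outcome is worse: a loses the pair
        else:
            score_a = 0.5   # equal outcomes: draw
        delta = k_factor * (score_a - expected_a)
        ratings[a] += delta
        ratings[b] -= delta
    return ratings

ratings = {w: 100.0 for w in ["w1", "w2", "w3", "w4"]}
outcomes = {"w1": "offer", "w2": "ignore", "w3": "ignore", "w4": "reject"}
print(elo_update(ratings, outcomes))  # the hired worker rises, the rejected one falls
```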
9
Skills: Further Work Usage of Skills
Derive skill-wise reputation scores. For example, we may recommend candidates for a job that requires java using their java scores.
As further work, the authors consider the usage of skills: jobs require particular skills, and workers claim expertise in some skills. The graph can be expanded so that there is one node per (worker, skill) pair instead of per worker, and one node per (employer, skill) pair instead of per employer (in fact, job nodes are used, which correspond to the employers who posted the jobs). WorkerRank can then run and export reputation results for (worker, skill) pairs. Thus, instead of learning one score per worker, we now learn one score per skill of the worker. For example, an employer who posts a job requiring java is interested in a candidate's java score and does not ask for his python score. A sketch of this expansion follows.
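A minimal sketch of this expansion, building on the hypothetical ApplicationGraph from earlier; the tuple-node encoding and field names are assumptions for illustration.

```python
def expand_with_skills(applications):
    """applications: iterable of (job, worker, outcome, required_skills).
    Produces one (job, skill) and one (worker, skill) node per skill,
    so WorkerRank yields a separate score per skill."""
    g = ApplicationGraph()  # from the earlier sketch
    for job, worker, outcome, skills in applications:
        for skill in skills:
            g.add_application((job, skill), (worker, skill), outcome)
    return g

g = expand_with_skills([
    ("job1", "alice", "offer", ["java"]),
    ("job2", "alice", "reject", ["python"]),
])
# Running WorkerRank on g gives alice separate java and python scores.
```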
10
Skills: Further Work
Expanding the job applications graph to support skills
Here is an example graph showing how the graph is expanded when the skills information is added. The new graph can be viewed as a set of skill-wise graphs, e.g. one java graph, one python graph, etc.
11
Set Up
Dataset: by oDesk, real-world application data
53 weeks between March 2012 and March 2013
~17M applications
~0.5M workers
~0.8M jobs
~0.2M employers
For evaluating WorkerRank, application data from the online labor marketplace company oDesk was used. The dataset spans 53 weeks and contains approximately 17M applications, 0.5M workers, 0.8M jobs and 0.2M employers.
12
Results
Coverage of Worker Applications: +50.7%; Cold Start
In this figure, the y-axis shows the percentage of workers for whom reputation has been obtained after the number of weeks indicated on the x-axis. The experiments show that the coverage of workers grows by 50.7%. The cold-start comparison concerns how quickly reputation information is acquired with each method: implicit reputation becomes available for nearly 100% of new workers within 12 weeks of joining the platform, whereas explicit feedback becomes available for only about 4% of new workers over the same 12 weeks.
13
Results
Lift(x) = P(hired | applicant in top x% of the ranking) / P(hired across all applicants)
Finally, the initial goal was to learn a reputation score for each worker so that we can sort the candidates of a job by quality and recommend the good workers to the employer. The lift metric is defined as the hiring probability of the top x% of applicants divided by the hiring probability across all applicants. This result shows that WorkerRank improves ranking accuracy compared to explicit feedback; in particular, it almost doubles the probability of ranking a good-quality worker in the top 0.5% of the list. As a reminder, the top-ranked workers are of the greatest interest for the system, since employers will usually look only at the top ~20 candidates. A sketch of the lift computation is given below.
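A small sketch of the lift computation as defined above; function and argument names are hypothetical.

```python
def lift(ranked_hired_flags, x):
    """ranked_hired_flags: list of 0/1 hire flags, best-ranked applicant first.
    x: top fraction to evaluate, e.g. 0.005 for the top 0.5%."""
    n = len(ranked_hired_flags)
    top = ranked_hired_flags[:max(1, int(n * x))]
    overall_rate = sum(ranked_hired_flags) / n   # P(hired) across all applicants
    top_rate = sum(top) / len(top)               # P(hired) within the top x%
    return top_rate / overall_rate

# Lift(0.005) near 2 would mean the top 0.5% of the ranking is hired at about
# twice the average rate, matching the improvement described above.
```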
14
Thank you