YTMND:Weighted Voting

From YTMND
Revision as of 21:25, September 20, 2006 by Inkdrinker (Talk | contribs)

Come up with a reasonable weighting algorithm that won't enrage half the site.

-Max

This is the place where we attempt to do just that.

Mewchu11's second algorithm

This one takes Max's thoughts on the subject more to heart.

Theory in writing, watch this space


Pilcrow's algorithm

Site creation and site appreciation have only a tenuous relationship to one another, and seniority and past success do not necessarily correlate with the ability to rate a site intelligently. An algorithm based on these will not necessarily improve the ratings, and it has the side effect of creating an elite.

If voting must be made more intelligent, make it intelligent by attacking the extremist voters. The strongest way to affect a site's rating is to give it 1 or 5 stars, but statistically, someone who is really participating in the community will rate sites throughout the spectrum. In fact, the MAJORITY of their votes should be 2-4 stars, with only the very worst and very best sites getting 1 or 5 star ratings. So, track the voting history of the people supplying the votes: the more their votes resemble a bell curve from 1 to 5, the more weight their votes get. Downvoters and upvoters will least resemble the bell curve and will therefore have their votes discounted the most.

For a second factor, give people more weight if, in addition to rating a YTMND, they also include comments, as this gives the author further feedback on how to please their audience. That factor should carry significantly less weight than the bell curve factor, however.

One could also add a factor for other forms of community participation - such as forum posting, wiki posting, and site posting. These should be very small elements, though, since one should not want to encourage people to pad any of those counts.

A formula similar to 0.8 * Bell_Curve_Metric + 0.1 * Comment_Frequency + 0.05 * Forum_&_Wiki_Posts + 0.05 * Sites_Submitted would strike me as about right.
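A rough sketch of how this formula might be computed is given here. The proposal does not define the bell curve metric, so the version in the sketch (one minus the distance between the user's vote shares and a reference bell shape) is purely an assumption, as are the reference shares, the function names, and the idea that the other three factors arrive already scaled to the range 0 to 1.

```python
# Hypothetical reference shares for votes 1..5: a rough bell shape centred on 3.
# These numbers are an assumption, not part of the proposal.
REFERENCE_SHARES = [0.1, 0.2, 0.4, 0.2, 0.1]


def bell_curve_metric(vote_counts):
    """Return 1.0 for a perfect match with the reference shape, lower otherwise.

    vote_counts: how many 1-, 2-, 3-, 4- and 5-star votes the user has cast.
    """
    total = sum(vote_counts)
    if total == 0:
        return 0.0  # assumption: no voting history earns no bell-curve credit
    shares = [count / total for count in vote_counts]
    # Half the summed absolute difference between the two distributions (0..1).
    distance = 0.5 * sum(abs(s - r) for s, r in zip(shares, REFERENCE_SHARES))
    return 1.0 - distance


def vote_weight(vote_counts, comment_frequency, forum_wiki_posts, sites_submitted):
    """Combine the four factors; the last three are assumed pre-scaled to 0..1."""
    return (0.8 * bell_curve_metric(vote_counts)
            + 0.1 * comment_frequency
            + 0.05 * forum_wiki_posts
            + 0.05 * sites_submitted)
```

With these placeholder shares, a user who only ever votes 5 ends up with roughly a tenth of the bell-curve credit of a user whose votes match the reference shape.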


Mewchu11's First algorithm

Personally I would love to see weighted voting take back YTMND from the NARVs but at the same time keep them happy. Therefore I am proud to unveil my version of the weighted algorithm. As with anything on a wiki, this page is open to editing. For the sake of logic, suggestions to edit this algorithm or any other should first be taken to the edit page. Also, if you were going to ask, yes, I am going to make a simple YTMND on this sometime tomorrow so that people can add their opinions.

Variables

A: Number of the user's sites with a rating higher than 3 after at least 10 votes (if this is 0, 1 or 2 we treat the variable as 3)

B: Average rating of all the user's sites (if this is undefined, we treat the variable as 3)

C: Average rating the user gives to other sites

D: Days since join

E: Donations

The Algorithm

Voting Power = [(A / 3) * (B / 3)] * [Ca * Da] + Ea

Finding Ca from C

Ca is determined not as a raw value, but as a deviation from 3.

If 1.0≤C<1.5 or 4.5<C≤5.0 then Ca = .5

If 1.5≤C<2.0 or 4.0<C≤4.5 then Ca = .75

If 2.0≤C<2.5 or 3.5<C≤4.0 then Ca = 1

If 2.5≤C≤3.5 then Ca = 1.25

Special case in Ca

If the user has fewer than 10 votes overall, Ca = 1 regardless of the above

This promotes fair and average voting, and moreover cripples rampant upvoters and downvoters.
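Because the bands are symmetric about 3, the lookup can be sketched in terms of the distance from 3. This is only an illustration: the function name is made up, and sending a value that sits exactly on a band edge to the band nearer 3 is an assumption.

```python
def ca_from_average(average_vote, total_votes):
    """Map a user's average vote C (1.0 to 5.0) to the multiplier Ca."""
    if total_votes < 10:
        return 1.0  # special case: too few votes to judge the user
    deviation = abs(average_vote - 3.0)
    # Assumption: a value exactly on a band edge falls in the band nearer 3.
    if deviation <= 0.5:
        return 1.25
    if deviation <= 1.0:
        return 1.0
    if deviation <= 1.5:
        return 0.75
    return 0.5
```

For instance, a user averaging 4.8 over 50 votes gets a Ca of 0.5, while the same average over only 5 votes still gets 1.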

Finding Da from D

Da is an outside modifier that scales the whole equation based on D, how long the user has been around. For the sake of simplicity, when I say “month” we can approximate it as 30 days.

< 1 month = 1

1 to 2 months = .9

2 to 4 months = .8

4 to 6 months = .75

6 to 8 months = .8

8 to 9 months = .9

9 months to 1 year = 1

1 to 1.5 years = 1.25

1.5 to 2 years = 1.5

2 to 2.5 years = 1.75

> 2.5 years = 2

This gives the newbie a chance but punishes slackers who don’t improve their scores in other ways (by making good sites or buying their way in). The four-to-six-month mark is the worst of it; after that the punishment eases and they are rewarded with a gradual increase in voting power. Once senior ranks (a year onward) are achieved, voting power will increase rather dramatically with time; by this point the user should have a good eye for what’s good, after all.
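The table above can be sketched as a simple band lookup. The names are illustrative, a year is taken as twelve 30-day months, and an age that lands exactly on a boundary is assumed to move into the later band.

```python
# (upper bound in 30-day months, Da): an account younger than the bound gets Da.
DA_BANDS = [
    (1, 1.0), (2, 0.9), (4, 0.8), (6, 0.75), (8, 0.8), (9, 0.9),
    (12, 1.0), (18, 1.25), (24, 1.5), (30, 1.75),
]


def da_from_days(days_since_join):
    """Return the time-based multiplier Da for an account of the given age."""
    months = days_since_join / 30  # a "month" is approximated as 30 days
    for upper_bound, da in DA_BANDS:
        if months < upper_bound:
            return da
    return 2.0  # 2.5 years or more
```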

Finding Ea from E

E is a vote increase at a fixed donation rate

1 E (a buck after PayPal) = .2 Ea

In other words, 5 bucks weighs your vote up by a single vote. I would suggest a cap somewhere so people can’t become insanely weighted (I’d say the limit would be 25 bucks / +5 vote strength). This function gives people a reason to donate other than out of love for YTMND.
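Putting the pieces together, the whole formula might look like the sketch below. The function and parameter names are invented for illustration, the cap of 5 is only the suggested limit, and reading B as the average rating of the user's own sites (passed as None when the user has none) is an interpretation rather than something the text states outright.

```python
def ea_from_donations(dollars_donated, cap=5.0):
    """Each dollar adds 0.2; the default cap of 5 (reached at $25) is the suggested limit."""
    return min(0.2 * dollars_donated, cap)


def voting_power(good_sites, own_site_average, ca, da, ea):
    """Voting Power = [(A / 3) * (B / 3)] * [Ca * Da] + Ea."""
    a = max(good_sites, 3)  # an A of 0, 1 or 2 is treated as 3
    b = 3.0 if own_site_average is None else own_site_average  # undefined B -> 3
    return (a / 3) * (b / 3) * (ca * da) + ea
```

A brand-new user with no sites, Ca = 1, Da = 1 and no donations comes out at exactly 1, so an unweighted vote is the baseline.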

Note of Incompleteness

I am aware that this might seem off to some people, and to them I say TELL ME. I don't think it's perfect, just usable. I want to see what other people are thinking, otherwise I wouldn't have put it on the wiki! One big thought I'm having is if there should be a maximum vote weight, and if so what if...


Wallet's algorithm

(first draft, more refinement will come in due time) Wallet 18:04, September 20, 2006 (CDT)

This method puts a smaller emphasis on vote strength and a greater one on vote skew. The algorithm holds the following true:

  • The average vote for the site as a whole should be 3; if the average is, say, 4, then the ratings 1-4 are precise, but the ratings 4-5 are vague.
  • Similarly, if someone is a downvoter, their high votes should have more weight than their low votes, and an upvoter's low votes more weight than their high votes.
  • The tendency for many members to vote "hot-or-not" makes a mockery of the famous 5-star system, and shouldn't be rewarded.
  • No matter what measures are taken on a person's vote, a vote lower than the site's average must never raise the average, and similarly a vote higher than the site's average must never lower the average.

The following proposal differs from Mewchu11's First in that it puts little weight on a member's supposed contribution. It differs from Pilcrow's in that skew is not punished across the board, but rather compensated for. Like all algorithms on this page, it should be evaluated on its logical merits, not on individual number values which can be changed without altering the spirit of the algorithm.

Assumptions

This algorithm assumes it would take way too much computing power to retroactively change the weights of votes already made without the user manually re-voting, or look up all the data on all users that have ever voted on a page. The only things assumed to be known by the algorithm before internal math are:

  • The number of each vote 1, 2, 3, 4, or 5
  • The number of votes on the site being appraised
  • The average rating on the site being appraised

With this info we can derive a lot of information, such as how much a vote in an unweighted system should change the average: simply take the number of votes N and the rating R and the result based on a vote V would be

(NR + V)/(N+1)

This is the assumed system, since it can't be a significant distance from the true current method, and so an algorithm based on it requires less work on Max's part to translate into a final implementation.

The process of weighting votes in this algorithm is to change the value V such that the above fraction outputs a different average vote. If we want to give extreme weight to votes, giving them a V of 0 or 6 is not unbecoming, as long as we limit the bounds of the final rating: a vote weighted to 0 might bring a one-vote-of-5 page down to 2.5, but can't bring the score below 1.00.
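As a minimal sketch of the assumed update rule with the bounds applied afterwards (the function name and the default bounds of 1 and 5 are illustrative choices):

```python
def updated_average(vote_count, current_rating, vote_value, low=1.0, high=5.0):
    """Apply (NR + V)/(N + 1), then keep the result within the displayed range."""
    new_average = (vote_count * current_rating + vote_value) / (vote_count + 1)
    return max(low, min(high, new_average))
```

Here `updated_average(1, 5.0, 0)` gives 2.5, matching the one-vote-of-5 example, while `updated_average(1, 1.0, 0)` would compute 0.5 and is held at 1.0.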

Exploit Prevention

General weighting regardless of skew is really a method of exploit prevention, either preventing a new member who has never seen other sites from voting unfairly, or a sock puppet account from upvoting friends and downvoting enemies. I believe no purely mathematical method can prevent such exploits and it requires diligence on the part of the administration or automated methods outside the scope of vote weighting; however, a good catch-all method is to give an automatic -50% weight vs. the average to all votes made by accounts that have not both existed for 1 month and logged 100 votes, or some other measure of activity:

(2NR + V)/(2N+1)
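One way to read this, sketched with made-up function names and with "1 month" taken as 30 days: votes already on the page count double when the voter is a new account, which roughly halves the new vote's pull on the average.

```python
def is_established(account_age_days, votes_cast):
    """True once the account has both existed for a month and logged 100 votes."""
    return account_age_days >= 30 and votes_cast >= 100


def average_after_vote(vote_count, current_rating, vote_value, established):
    """Use (NR + V)/(N + 1) for established accounts, (2NR + V)/(2N + 1) otherwise."""
    scale = 1 if established else 2
    n = scale * vote_count
    return (n * current_rating + vote_value) / (n + 1)
```

On a page with 10 votes averaging 3.00, a 5 from an established account moves the average to about 3.18, while the same vote from a new account moves it to about 3.10.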

The following is also vulnerable to moderate exploitation with very careful and diligent vote-changing; though this is a small flaw, it should be compensated for. The easiest method is to limit changing votes. Vote-changing could easily be limited to 25 per day, but I know that I for one wouldn't have much trouble using this up, as I often change votes once or twice even on my first visit to a page as I reconsider my rating. Logging vote times henceforth might be useful; then vote-changing could be unlimited for sites a user has never rated before today, but limited to 25 per day on sites the user first rated more than a day ago.

An Average of 3

The entire philosophy of this algorithm is that the natural state for any given page is a rating of 3.00; that is, all votes on YTMND whatsoever, after weighting, should together average to 3.00. It's important to recognize that five votes of five stars doesn't make 25 stars; we're dealing with an average, so we must calibrate our scale so that the median vote is our "zero".

Part 1: Tending to 3

As such, a user's votes should average 3.00 after weighting. As an example, assume a user's current average vote after weighting is less than 3. In this case, less weight is given to votes below the user's average, and more weight to those above.

Any method you imagine will work fine, but the following is a simple one derived from the requirements: Let s equal the site's current rating, and v equal a user's vote. Regardless of how we want to weigh it,

If v<s, s must not increase
If v=s, s must not change
If v>s, s must not decrease

We can safely ensure this by only weighting the difference, d, where d=v-s, by multiplication of a weight, w, which is non-negative.

I'm still working on the exact values, but the gist is that your final weighted vote, V, looks a bit like:

V = v + dw "The final counted vote is the original vote, plus the difference (d) times w"

I don't yet have a nice equation to determine w, but you can think of it in general terms based on whether a vote is inside or outside a user's average vote a. A vote v is here considered "outside" a if it is both further from 3.00, and on the same side of 3.00 as a, while an "inside" vote is either closer to 3.00 or on the opposite end of 3.00 relative to a.

E.g. if a user's average vote is 3.5, a 4 or 5 is "outside" and a 1, 2, or 3 "inside". Feel free to edit this if you think of clearer terminology.

So, consider the weight w as such:

If v is outside a, w<1
If v is inside a, w>1
If v=a, w=1
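The inside/outside rule and the weighted vote can be sketched as follows. Since no equation for w exists yet, the three weights here are placeholders that merely satisfy the inequalities above, and the function names are invented. The sketch follows V = v + dw exactly as written.

```python
def classify_vote(vote, user_average):
    """Return 'equal', 'outside' or 'inside' relative to the user's average a."""
    if vote == user_average:
        return "equal"
    same_side = (vote - 3.0) * (user_average - 3.0) > 0
    further_out = abs(vote - 3.0) > abs(user_average - 3.0)
    return "outside" if same_side and further_out else "inside"


# Placeholder weights: the text only requires w < 1 outside and w > 1 inside.
EXAMPLE_WEIGHTS = {"outside": 0.5, "equal": 1.0, "inside": 1.5}


def weighted_vote(vote, site_rating, user_average, weights=EXAMPLE_WEIGHTS):
    """V = v + d*w, with d = v - s and w chosen by where v sits relative to a."""
    d = vote - site_rating
    w = weights[classify_vote(vote, user_average)]
    return vote + d * w
```

With a user average of 3.5, `classify_vote` labels 4 and 5 as outside and 1, 2 and 3 as inside, as in the example above. One thing to keep in mind when picking real values for w: taken literally, v + dw returns the vote unchanged only when w = 0 or v = s.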

Note this means that if a person's average vote is 2, and the person votes nothing but 2, the weighted vote will be 2. The algorithm presumes that virtually no one only votes one rating 2-4 absolutely all the time, and so given enough time, the weighting will tend towards correct. A more robust system would, naturally, be more complicated. As for people voting only 1 or 5 all the time, this is covered by the following section:

Part 2: Punishing Binary Votes

Most users are divided into two camps: those that reserve 1s and 5s for more extremely bad/good pages thus keeping their votes in a logical perspective of preference, and those who only vote 1s and 5s, thus giving their vote more power against the average. The logic behind only voting 1 or 5 is that one can better force the average to approach the rating the user would actually give the site; thus users attempt to exert as much dictatorship as possible on the democratic vote.

To discourage such behavior, the best way is to make it a moot point. First let's forget the "tending to 3" section and assume that 2, 3, and 4 have a constant application on the average (as 2, 3, and 4). Let 1' be the number of 1-star votes, 2' be the number of 2-star votes, and so on. For a given user, the actual recorded numbers would be:

For a vote of one,
2 - 2^-(1'/2' - 1) "Two, minus two to the power of negative (1'/2' minus one)"
For a vote of five,
4 + 2^-(5'/4' - 1) "Four, plus two to the power of negative (5'/4' minus one)"

(Note that these require a non-zero 2' and 4' - I suggest minimum values of 1)

The end result of this is, if you have ten times as many 1s as 2s, your 1s act as 2s, and if you have ten times as many 2s as 1s, your 1s act as 0s. (There's a little rounding involved, don't any math nuts get on my case.)

This satisfies three conditions:

  • If you almost never vote extreme, your extreme votes will be worth even more than their unweighted numerical amount.
  • If you give an even number to all votes 1-5, they will all act according to their unweighted numerical amount.
  • If you only vote 1 or 5, after you vote 10-20 times, all your subsequent votes will act as though you're only ever voting 2 or 4.

The result of this is that the necessary evil of only voting 1 or 5 for maximum weight has been eliminated, since maximum weight is achieved only with moderation in the use of extreme votes.
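The two expressions can be checked numerically with a short sketch. The function names are illustrative, and the floor of 1 on 2' and 4' is the suggested minimum from the note above.

```python
def adjusted_one(ones, twos):
    """Recorded value of a 1-star vote: 2 - 2^-(1'/2' - 1)."""
    ratio = ones / max(twos, 1)  # suggested minimum of 1 for 2'
    return 2 - 2 ** -(ratio - 1)


def adjusted_five(fives, fours):
    """Recorded value of a 5-star vote: 4 + 2^-(5'/4' - 1)."""
    ratio = fives / max(fours, 1)  # suggested minimum of 1 for 4'
    return 4 + 2 ** -(ratio - 1)
```

With equal counts the votes keep their face value (`adjusted_one(5, 5)` is 1 and `adjusted_five(5, 5)` is 5). With ten times as many 1s as 2s, a 1 is recorded as just under 2, and with ten times as many 2s as 1s it is recorded as about 0.13, which is the rounding mentioned above.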

Final Result

Though w has as yet been poorly-defined, we can now give a final weight by slapping the two methods together crudely:

V = v + dw
If v = 1 or 5, apply the following:
For a vote of one,
V' = V + 1 - 2^-(1'/2' - 1)
For a vote of five,
V' = V - 1 + 2^-(5'/4' - 1)

V (or V' as necessary) can now be applied to (NR + V)/(N+1) as-is to change the page's average, or additionally weighted with other considerations toward longevity, community contribution, number/ratings of the user's own pages, donation, etc.
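A crude end-to-end sketch of the combined method, under the same caveats as the text: w is passed in by the caller because it has no formula yet, the names are invented, and the clamp to the 1-5 range reflects the earlier remark about limiting the bounds of the final rating.

```python
def final_vote(vote, site_rating, w, vote_counts):
    """Combine V = v + d*w with the 1-star / 5-star adjustment (V').

    vote_counts: the user's tallies of 1-, 2-, 3-, 4- and 5-star votes.
    w: non-negative weight, supplied by the caller since it is not yet defined.
    """
    ones, twos, _, fours, fives = vote_counts
    weighted = vote + (vote - site_rating) * w
    if vote == 1:
        weighted += 1 - 2 ** -(ones / max(twos, 1) - 1)
    elif vote == 5:
        weighted += -1 + 2 ** -(fives / max(fours, 1) - 1)
    return weighted


def new_site_average(vote_count, site_rating, final_value):
    """Feed V (or V') into (NR + V)/(N + 1), clamped to the 1-5 range."""
    average = (vote_count * site_rating + final_value) / (vote_count + 1)
    return max(1.0, min(5.0, average))
```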


???'s algorithm

- Have a completely different way of making weighted voting work? Please feel free to include it here!


The problem with normal distribution

Most of the algorithms seem to suggest that votes should be normally distributed with a median of 3, but such a distribution also assumes that there are an equal number of sites worthy of 5 stars and 1 star. Is that really the case? At all?

Even Max, whose voting is mostly restricted to random looks at the front page, has an average under 3 and like 60% more 1s than 5s. To end up with an average of 3 or as many 1s as 5s, one really has to go out of their way to avoid sites they suspect might be bad, or just refrain from voting on the majority of them.

If our voting system really assumes that there are an equal number of awful sites as there are good ones (rather than what should be apparent - that the chaff outweighs the cream by orders of magnitude), the system will be just one more factor contributing to the already over-inflated scores sites are getting and will further discourage low votes, as users afraid of screwing up their weighting will bump votes up to 2s and 3s.

To recap, a 3 median punishes users who visit more than their preferred pockets of the site and vote accordingly, and also artificially inflates the scores of the mass of sites which deserve 1 and 2 stars. Each instance of pointing out a bad site will now trigger not only the unavoidable revenge downvoting, but a loss in that user's vote weight.

Perhaps nudging the median to 2 would be called for?--Inkdrinker 22:25, September 20, 2006 (CDT)