Quote from Kender, Mar 06, 2011 - 21:52:46
I also never liked having challenge count entering the equations, but for a different reason: I feel it rewards people for doing the same repetitive things over and over, which in my opinion is what happens on sites with a high challenge count.
Some sites, yes, probably so. I don't think this is true of all high-challenge-count sites -- for example, I don't think it is true of Hacker.org (~283 challenges). As far as repetition goes, I'd say the biggest repetition of challenges sits in the easier 10-15% of a site's challenges, since there does seem to be a standard set of 'starter' challenges. I can't say that I've noticed a lot of repetition on high-challenge-count sites -- some, yes, but not a lot. This, of course, is a good reason to score the early percentages lower, but I believe the difficulty values passed to WeChall by most sites compensate adequately without further manipulation by WeChall's scoring mechanisms.
Quote from Kender, Mar 06, 2011 - 21:52:46
WeChall's scoring aims to reflect the amount or level of skill someone has. Not how much time they are willing to spend doing very similar things.
I absolutely agree with that goal.
Quote from Kender, Mar 06, 2011 - 21:52:46
"points per challenge" is never going to work with that.
I don't think that is as certain as you make it sound.
Quote from Kender, Mar 06, 2011 - 21:52:46
And the difficulty of individual challenges can never be taken into account here.
The difficulty of individual challenges can probably never be fully taken into account, but that information is, for the most part, reflected in the scores sent to WeChall. You say below that this doesn't matter, but I'm not sure how or why it doesn't. True, WeChall has to depend upon the linked sites' difficulty calculations, but I can't think of many sites where those are badly wrong. Beyond that, WeChall can infer challenge difficulty from some of the information sent in by the linked sites.
WeChall can also infer the difficulty of a site as a whole from some of the user statistics and also from the 'Dif' column of the 'Sites' table, which is subjective but looks about 80-85% correct to me.
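To make that inference idea concrete, here is a rough sketch of what I mean (Python, purely illustrative -- the inputs are hypothetical, not anything WeChall actually exposes): a challenge that few of a site's active players have solved is probably hard, and a 'starter' challenge nearly everyone has solved is probably easy.

```python
# Illustrative sketch only: estimate a challenge's relative difficulty
# from solver counts. These inputs are hypothetical, not WeChall's API.

def challenge_difficulty(solvers: int, active_players: int) -> float:
    """Return a difficulty in [0, 1]; rarely solved challenges score high."""
    if active_players == 0:
        return 1.0  # no data at all: assume hard
    solve_rate = solvers / active_players
    return 1.0 - solve_rate  # the simplest possible inversion

# A 'starter' challenge solved by 90% of players vs. one solved by 3%:
print(challenge_difficulty(900, 1000))  # 0.1  -> easy
print(challenge_difficulty(30, 1000))   # 0.97 -> hard
```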
Quote from Kender, Mar 06, 2011 - 21:52:46
So yes, solving a challenge of difficulty X on a site with a lot of challenges is going to be worth less than solving a challenge of difficulty X on a site with fewer challenges. This is fair when you think about it for a bit and factor in the choice of sites to play on and the challenges to choose from on each site.
I don't understand how this is fair or how it does anything but encourage the playing of easy sites, and of the easy challenges on those sites. It also discourages playing the bigger sites, which, for one, isn't really fair to those larger sites and, for two, discourages, at least in some cases, playing the harder challenges.
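A quick back-of-the-envelope example shows what I mean. Suppose (and this is my simplified model, not WeChall's exact formula) that every linked site is worth the same fixed total, split evenly across its challenges:

```python
# Hypothetical model, not WeChall's actual formula: every site is worth
# the same fixed total, split evenly across its challenge count.
SITE_TOTAL = 10000.0  # assumed fixed site value

def points_per_challenge(challenge_count: int) -> float:
    return SITE_TOTAL / challenge_count

# The same difficulty-X challenge pays very differently by site size:
print(points_per_challenge(80))    # 125.0 points on a small site
print(points_per_challenge(2000))  # 5.0 points on a huge site
```

Under any scheme of roughly this shape, a point-maximizer picks the 80-challenge site every time, no matter where the harder challenges actually live.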
Quote from Kender, Mar 06, 2011 - 21:52:46
Try to shift your thinking from being rewarded for solving a challenge to being rewarded for solving a site.
I understand. This is part of why I used to think the challenge count weighting made some sense. I don't think so anymore. It doesn't work well when some sites have 80 challenges, some have 250, and some have 2000+.
Quote from Kender, Mar 06, 2011 - 21:52:46
I recommend against using the user-voted "dif" column. It is extremely unreliable. For some people it represents how difficult it is, for others how easy it is, and for yet others how balanced the difficulty is. Also, most sites have no more than a handful of votes.
I know. That is the worst part of my idea, because it depends upon subjective evaluations of difficulty and because voting is sparse. Still, when I look at those 'Dif' values, they do not look very far from the truth. You could probably solve the sparse-voting issue by requiring a vote at, say, 75% solved, but that is pretty draconian and unfriendly. It is probably better to beg for voluntary voting.
Quote from Kender, Mar 06, 2011 - 21:52:46
The "average" column is a much better indicator of the difficulty of completing a site.
Of the sites I've played in earnest, I'd say the "average" column is a horrible representation of difficulty -- much worse than the subjective 'Dif' column. The 'Average' column makes Ma's and Electric mid-range-difficulty sites, which is very wrong, and it makes Hax.tor the easiest, which is also very wrong. And SPOJ comes out the hardest, I'd wager, only because it is so huge. My guess is that it should be toward the top of the hard sites but not the absolute hardest, judged by difficulty rather than by sheer mass.
I wouldn't be opposed to using both columns though.
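Something like the following is what I have in mind (a sketch only -- the 60/40 weights are arbitrary and the scales are my own assumption): use the 'Average' statistic as the baseline and let the 'Dif' votes adjust it where votes exist.

```python
from typing import Optional

# Sketch of blending the subjective 'Dif' vote with the objective
# 'Average' completion statistic. The 60/40 split is arbitrary; the
# point is only that sparse votes can be backed by solve statistics.

def blended_difficulty(dif_vote: Optional[float], avg_completion: float) -> float:
    """dif_vote on a 0-10 scale, or None when unvoted; avg_completion in [0, 1]."""
    from_average = (1.0 - avg_completion) * 10.0  # low completion -> hard
    if dif_vote is None:
        return from_average  # no votes: fall back to the statistic alone
    return 0.6 * dif_vote + 0.4 * from_average

print(blended_difficulty(7.5, 0.20))   # voted hard, rarely completed -> 7.7
print(blended_difficulty(None, 0.65))  # unvoted site -> 3.5
```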
Quote from Kender, Mar 06, 2011 - 21:52:46
Whether or not a site internally also uses difficulty for scoring (like rosecode or hackquest) makes little difference. 0% complete is still 0 points and 100% complete is still max points. It's just the curve that's a bit different.
That doesn't make sense to me. A challenge on Rosecode with an internal difficulty of '4' was worth 122 points here; a challenge there with a difficulty of '25' was worth 649 points. And I solved these back to back, so the difference isn't due to WeChall's weighted scoring -- I hadn't gained enough in between to trigger the bonus that high percentages give. That difficulty difference, in the form of site percentage or challenge score (I'm not exactly sure which), was passed to WeChall by Rosecode, and Rosecode is not the only place where I've noticed this. Conversely, CSTutoring does not internally rate by difficulty, and you can see a very smooth per-challenge point change as you work through the site. Some sites do pass internal difficulty calculations to WeChall. That has to matter if you want fairness and if you want to approximate a measurement of 'skill'.
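My best reconstruction of what is happening (a guess from the numbers I saw, not anything Rosecode has documented) is that a difficulty-weighted site reports a bigger percentage jump for a hard solve, and WeChall's points follow that percentage, while a flat-rated site like CSTutoring reports the same jump for every solve:

```python
# Hypothetical reconstruction from the numbers I observed; all values
# below are made up for illustration, not Rosecode's actual ratings.

difficulties = [4, 25, 10, 10, 10]   # a site's internal difficulty ratings
total = sum(difficulties)            # 59

def percent_gain(diff: int) -> float:
    """Share of the site's completion that one solve is worth."""
    return 100.0 * diff / total

print(f"{percent_gain(4):.1f}%")   # ~6.8% of the site for the easy solve
print(f"{percent_gain(25):.1f}%")  # ~42.4% for the hard one

# A flat-rated site reports the same gain for every solve:
print(f"{100.0 / len(difficulties):.1f}%")  # 20.0% each, regardless of pick
```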
Quote from Kender, Mar 06, 2011 - 21:52:46
My solution:
I propose we ignore all challenges over 200 per site for the score. That should take care of the problems you described.
It would be great if you could explain this in more detail; as is, I don't understand it at all. You suggest that 200 challenges is the most that will ever be scored? Aren't you de facto cutting the harder challenges right out of the ranking? In some cases, but not all, you could intentionally solve the harder challenges first, but why should you have to? This solution seems pretty at odds with the idea of measuring skill.
It would also weight the sites very strangely, since every site with 200 or more challenges suddenly becomes equivalent.
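To put numbers on that last point (again, under my reading of the proposal, which may not be what you intend): if only the first 200 solves per site count, the scored fraction of a site collapses as the site grows.

```python
# Sketch of the proposed cap as I understand it: only the first 200
# solved challenges per site ever count toward the score.
CAP = 200

def scored_fraction(challenge_count: int) -> float:
    """Fraction of a site's challenges that can ever earn points."""
    return min(challenge_count, CAP) / challenge_count

for count in (80, 250, 283, 2000):
    print(count, f"{scored_fraction(count):.0%}")
# 80   -> 100%  small sites are unaffected
# 250  -> 80%   a fifth of the site can never score
# 283  -> 71%   Hacker.org loses nearly a third
# 2000 -> 10%   nine tenths of a SPOJ-sized site is invisible to the ranking
```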