Paris Large Scale Sakai

People expressing interest in issues of Large Scale Sakai and/or code reviews.

Person | Affiliation | Interests | Email address
David Haines | University of Michigan | information sharing, code reviews, best practices | dlhaines@umich.edu
Former user (Deleted) | Unicon, Inc. | large scale Sakai performance and optimization | holdorph@unicon.net
Ray Davis | U. California, Berkeley | information sharing, code reviews, best practices | ray@media.berkeley.edu
Alan Berg | University of Amsterdam | information sharing, automation of central QA | a.m.berg@uva.nl
Former user (Deleted) | Unicon, Inc. | performance testing framework | jlewis@unicon.net
Former user (Deleted) | Université Pierre et Marie Curie | information sharing, code reviews, best practices, Scorecard, QA, testing | jean-francois.leveque@upmc.fr
Sonette Yzelle | University of South Africa | information sharing, code reviews, best practices | syzelle@unisa.ac.za
Former user (Deleted) | Indiana University | information sharing, code reviews, best practices | wagnermr@iupui.edu
Former user (Deleted) | Arizona State University | information sharing, code reviews, best practices | josh@asu.edu

Comments from meeting

Debate on a more specific type of Jira for performance

Is there interest in code review?
OSP bug fest

http://qa1-nl.sakaiproject.org/codereview/bug_dashboard/osp_test_case.html

      Separate branch
      In bar debugging

Driven by people who have problems in production.
Indiana ran FindBugs on some code and has cleaned it up.
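
As an illustration of the kind of finding a FindBugs pass tends to turn up, here is a minimal sketch of a string comparison by reference, a pattern static analysis flags because it only works when both strings happen to be interned. The class and method names are invented for the example.

    public class SiteTypeCheck {

        // Flagged by FindBugs-style checks: == compares object identity, not
        // the characters, so this silently fails for strings built at runtime.
        public boolean isProjectSiteBroken(String siteType) {
            return siteType == "project";
        }

        // Fix: compare values (and tolerate null) with equals().
        public boolean isProjectSite(String siteType) {
            return "project".equals(siteType);
        }
    }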

Performance issues:

  • Gradebook integration
  • Chat
  • Forums
  • Presence

Gilgamesh will pair up on gradebooks and static code review.

Good code review happens through emails.
Jiras could potentially trigger code reviews.
Good idea, but how to get resources?
Good to avoid unpleasant surprises.
Is there room for code review as a criterion for merging?
A bug fix has to be tested by someone other than the author.

Assuming that it is running elsewhere is dangerous.
A tool for improving stability, not directly performance.

How to get things done?

Use statistics as ammunition.
How to deal with lots of trivial bugs scattered across the whole code base?
Restricted rules.
Synchronization issues (see the sketch below).
One-time clean-up.
A simple process for activating a one-time clean-up.
String literals first.
Commit boundaries don't have meaning.
Bug squashing of trivial rules as a project.
Commit trigger.
Core services in Sakai without owners.
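
As a concrete example of the synchronization issues mentioned above, the sketch below shows a pattern static analysis commonly flags in web applications: a DateFormat held in a static field and shared across request threads. The class and field names are invented for the illustration.

    import java.text.DateFormat;
    import java.text.SimpleDateFormat;
    import java.util.Date;

    public class SubmissionFormatter {

        // Risky: SimpleDateFormat is not thread-safe, so concurrent requests
        // sharing this static instance can corrupt each other's output.
        private static final DateFormat SHARED = new SimpleDateFormat("yyyy-MM-dd HH:mm");

        // Safer: give each thread its own formatter instance.
        private static final ThreadLocal<DateFormat> PER_THREAD = new ThreadLocal<DateFormat>() {
            protected DateFormat initialValue() {
                return new SimpleDateFormat("yyyy-MM-dd HH:mm");
            }
        };

        public String formatShared(Date when) {
            return SHARED.format(when); // flagged by FindBugs-style checks
        }

        public String formatSafely(Date when) {
            return PER_THREAD.get().format(when);
        }
    }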

Items should not stay on a shame list for too long.
The finalizers Jira is still not done.
How to make the stick come down heavier?

Organize around events?
Developers have a different timetable.
Choose rules (smallest set) at the BOF

Filtered list, reduced or added to based on anonymous ratings.
Sakai Bogobugs, and an automated website for rating tools.
More liberal on commit access?

Plan

Alan + Gill + David to make a list of rules.
Rules open to feedback for three weeks
Build a comparison site for Sakai Bogobugs

Original email

Hi,

Prior to the session in Paris I'd like to jump-start some discussion about the issues of successfully using Sakai in installations with really large numbers of users. At Michigan we've run into issues in production with code that works perfectly well until there are thousands of users or lots of data. It would be a lot more efficient and less embarrassing to find ways to write code that avoids the problems altogether.

Developers are usually responsive in fixing these issues, but often the issues could have been avoided altogether by recognizing during development that certain approaches are simply not going to work at large scale. However, it isn't feasible to expect that every developer will have the tools and the experience to discover all of those situations. It's unrealistic to assume that a developer can test a chat with hundreds of concurrent users, or thousands of resources, or hundreds of people belonging to a single site. Recognizing and adhering to best practices based on past experience can go a long way toward avoiding some issues, but relying on a best practices document alone would not have a large impact.
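
To make that kind of problem concrete, the sketch below contrasts two ways a tool might list site members: loading every member into memory at once, which behaves fine in testing but degrades badly on sites with thousands of members, versus asking the service for one bounded page at a time. The service interface and method names are invented for illustration and are not actual Sakai APIs.

    import java.util.List;

    public class MemberLister {

        // Hypothetical service, not an actual Sakai API.
        public interface MembershipService {
            List<String> getAllMemberIds(String siteId);
            List<String> getMemberIds(String siteId, int offset, int pageSize);
        }

        // Works in testing, but on a site with thousands of members this
        // pulls the entire membership into memory for every page view.
        public void renderAll(MembershipService service, String siteId) {
            for (String memberId : service.getAllMemberIds(siteId)) {
                System.out.println(memberId);
            }
        }

        // Scales better: fetch and render one bounded page at a time.
        public void renderPage(MembershipService service, String siteId, int page) {
            int pageSize = 50;
            for (String memberId : service.getMemberIds(siteId, page * pageSize, pageSize)) {
                System.out.println(memberId);
            }
        }
    }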

I'm proposing a few things that can be done to help minimize scale problems in the future.

  • Document issues that have come up. We should make sure to notice the evidence and lessons provided by events that have been resolved.
  • Develop some guidelines for practices that specifically help avoid scaling issues. It should be easy for a developer to look up what has and hasn't worked in Sakai before.
  • Extend Alan Berg's static analysis, where possible, to flag these questionable patterns. Alan has shown that a number of questionable practices can be found by static analysis. A small list of patterns customized for Sakai would help avoid specific issues without the churn of dealing with many false positive hits (one such pattern is sketched below).
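
As one example of a pattern that static analysis can catch and that also hurts at scale, the sketch below concatenates strings inside a loop; each iteration copies the whole buffer built so far, so the cost grows quadratically with the number of items. The class and method names are invented for the example.

    import java.util.List;

    public class RosterExport {

        // Questionable: each += copies the entire string so far, so exporting
        // a roster of n members does O(n^2) character copying.
        public String exportSlowly(List<String> memberNames) {
            String csv = "";
            for (String name : memberNames) {
                csv += name + ",";
            }
            return csv;
        }

        // Preferred: append into a single buffer, which stays linear in n.
        public String exportQuickly(List<String> memberNames) {
            StringBuilder csv = new StringBuilder();
            for (String name : memberNames) {
                csv.append(name).append(',');
            }
            return csv.toString();
        }
    }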

In addition to these general steps, those institutions that have a large stake in Sakai working at scale could collaborate to take more proactive steps. Code reviews have been shown to be a very effective way to eliminate bugs. A few developer hours spent on code review is a good investment if it saves a few thousand people from sitting around wondering how long it will be until their CLE is working again. Focused code reviews that address specific scaling issues, with their results made public, will both help interested institutions decide what code is safe to run and give developers guidance on how to write code that is likely to be adopted. Collaborating on those code reviews will also help avoid duplicated effort.

Nothing in these suggestions should be read as mandating coding practices. Anyone can write code any way they want. On the other hand, institutions running code should be able to know whether the tools they want to use are written in a way that gives them confidence that they will work successfully.

Any comments are welcome.

  • Dave