Type: New Feature
Status: Closed
Resolution: Fixed
Priority: Major
Assignee: DmitryRyashchentsev
Reporter: PRE
Votes: 0
Watchers: 0

Implement Time Bound Equity Statistics

Created: 01/Mar/10 04:33 PM   Updated: 31/May/10 01:51 PM   Resolved: 31/May/10 01:51 PM
Component/s: CeQ math
Affects Version/s: Milestone 1.2, WebSynergy Plugin, Liferay plugin, Milestone 1.1, Further Updates, Demo for JavaOne, CEQ Documentation, Sunspace
Fix Version/s: Milestone 1.4

Time Tracking: Not Specified


Description

1. Calculate daily, weekly, and monthly equity statistics per info, per person, and per tag during the aging process, and store the data in cache tables.
2. Provide a function that can create the statistics and caches from scratch based on the CeQ activity log.
3. Provide a web service API to access the statistical data (more details will be provided by PRE/MAX).
4. Implement access control (details need to be defined).

Services API is on

DmitryRyashchentsev added a comment - 01/Mar/10 04:48 PM

My proposal was to keep in the cache tables: daily statistics for the last week, weekly statistics for the last month, and monthly statistics after that.
That significantly reduces the amount of stored data.
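
The retention proposal above can be sketched as a small lookup that maps the age of a day to the granularity at which its statistics would be kept. This is only an illustration: the function name and the window sizes (7 and 30 days) are assumptions, since the comment only says "last week" and "last month".

```python
from datetime import date

# Assumed window sizes; the thread only says "last week" and "last month".
DAILY_WINDOW_DAYS = 7
WEEKLY_WINDOW_DAYS = 30


def granularity_for(day: date, today: date) -> str:
    """Return the granularity at which statistics for `day` would be kept."""
    age = (today - day).days
    if age < 0:
        raise ValueError("day is in the future")
    if age <= DAILY_WINDOW_DAYS:
        return "daily"      # recent days keep one row per day
    if age <= WEEKLY_WINDOW_DAYS:
        return "weekly"     # older days are rolled up per week
    return "monthly"        # everything older is rolled up per month
```

With these assumed windows, a day two days old stays daily, a day nineteen days old is kept weekly, and anything older than thirty days is kept monthly.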

PRE added a comment - 15/Mar/10 04:50 PM

1. Calculate daily, weekly, and monthly equity statistics per info, per person, and per tag after the aging process, and store the data in cache tables.

DmitryRyashchentsev added a comment - 15/Mar/10 06:28 PM

From our discussion:

Phase 1:
Provide daily statistics (calculated just after aging and the recalculation of materialized values) of CQ and PQ for:
1 infos,
2 people,
3 tags,
4 people-tags,
5 info-tags,
6 country-tags

(We can access these data by API and by direct DB access)

Phase 2:
Provide Web services to access the data

Phase 3:
Provide widgets:
1 simple form to get Equity per given period
2 top list per given period
3 graph
4 trend analysis

Phase 4:
Provide an admin interface, and replace old fine-grained statistics with coarse-grained ones

PRE added a comment - 21/Apr/10 01:43 PM

Add a service to recalculate all caches, with from and to date options

DmitryRyashchentsev added a comment - 28/Apr/10 06:27 PM

Provided daily statistics calculation for PQ and CQ:
1. info
2. people
3. tags
4. person-tag

We really do not need to keep info-tag statistics, as we have the statistics for info and the info-tag filter, so we can easily get info-tag statistics with a simple join.
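
A minimal sketch of the join mentioned above, using an in-memory SQLite database. The table and column names (statistics_info, info_tag, day, info_id, tag, cq, pq) are hypothetical and chosen for illustration only; the real CeQ schema is not shown in this issue.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Build a tiny in-memory database with hypothetical table names."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE statistics_info (day TEXT, info_id INTEGER, cq REAL, pq REAL);
        CREATE TABLE info_tag (info_id INTEGER, tag TEXT);
        INSERT INTO statistics_info VALUES ('2010-04-27', 1, 10.0, 2.0);
        INSERT INTO statistics_info VALUES ('2010-04-27', 2, 4.0, 1.0);
        INSERT INTO info_tag VALUES (1, 'java');
        INSERT INTO info_tag VALUES (1, 'cloud');
        INSERT INTO info_tag VALUES (2, 'java');
        """
    )
    return conn


def info_tag_statistics(conn: sqlite3.Connection, day: str):
    """Derive per-(info, tag) rows by joining info statistics with the tag filter."""
    rows = conn.execute(
        """
        SELECT s.info_id, t.tag, s.cq, s.pq
        FROM statistics_info AS s
        JOIN info_tag AS t ON t.info_id = s.info_id
        WHERE s.day = ?
        ORDER BY s.info_id, t.tag
        """,
        (day,),
    )
    return rows.fetchall()
```

Because the per-info rows already carry CQ and PQ, the join only attaches tags to them, which is why a separate info-tag statistics table would be redundant under this assumed layout.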

The statistics process is now run by the daily scheduler, after aging and the recalculation of materialized values.
These processes can also be run manually via:

provided http://localhost:8088/ceq-ws/jersey/math/aging?type=0 - aging
provided http://localhost:8088/ceq-ws/jersey/math/aging?type=1 - materialized values recalculation
provided http://localhost:8088/ceq-ws/jersey/math/aging?type=2 - statistics
provided http://localhost:8088/ceq-ws/jersey/math/aging - run all 3 processes
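
The four trigger URLs listed above differ only in the optional type parameter, so they can be built from one helper. This is a sketch: the helper name and the process labels are assumptions, and only the base URL and the type values 0, 1, and 2 come from the comment.

```python
from typing import Optional
from urllib.parse import urlencode

# Base URL and type values as listed in the comment above.
BASE_URL = "http://localhost:8088/ceq-ws/jersey/math/aging"

# Hypothetical labels for the three type values.
PROCESS_TYPES = {"aging": 0, "materialized": 1, "statistics": 2}


def trigger_url(process: Optional[str] = None) -> str:
    """Return the URL that starts one process, or all three when process is None."""
    if process is None:
        return BASE_URL
    return BASE_URL + "?" + urlencode({"type": PROCESS_TYPES[process]})
```

The thread does not say which HTTP method the endpoint expects, so the sketch stops at building the URL rather than sending a request.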

PRE added a comment - 07/May/10 02:49 PM

Dima - How do I know that the statistics were calculated? I looked at the statistics-* DB tables and they are all empty.

DmitryRyashchentsev added a comment - 07/May/10 02:57 PM

For now, the statistics-* tables are the only way to test.
But statistics are calculated only for past days (up to yesterday).

PRE added a comment - 07/May/10 03:02 PM - edited

Hmm, what if we import a feed that has activities which are a few months old? Actually, this is what I am doing right now, e.g. importing the SunSpace ATOM feed for the last few months. Maybe we need a parameter to define the aging start date.


DmitryRyashchentsev added a comment - 07/May/10 03:47 PM - edited

That should work for activity older than a day.
Can you please send me the feed URL and the other parameters?

Site feed = checked
Feed Type = activitystream

You can change days and pagesize

DmitryRyashchentsev added a comment - 08/May/10 12:54 PM

The problem was that the first statistics calculation covered the previous day only (each following calculation starts from the date of the last calculated one).
In this feed there was no activity for May 6.

I changed the behaviour: now the first statistics calculation covers the whole period up to the previous day.
I did not risk doing it initially because it can take a long time: for the example above I got statistics for 169 days, which took ~9 mins (3 secs per day).

For now, the progress of the statistics calculation can be observed only in server.log; the aging?type=2 call only starts the process.
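
The changed behaviour described in this comment can be sketched as a function that picks the days still to be calculated. The function names are hypothetical, the earliest activity date in the usage note is invented to reproduce the 169-day case, and starting the next run on the day after the last calculated one is an assumption about how "starts from the date of the last calculated one" is meant.

```python
from datetime import date, timedelta
from typing import List, Optional

SECONDS_PER_DAY_ESTIMATE = 3  # rough figure quoted in the comment above


def days_to_calculate(
    earliest_activity: date,
    last_calculated: Optional[date],
    today: date,
) -> List[date]:
    """List the days whose statistics are still missing, up to yesterday."""
    yesterday = today - timedelta(days=1)
    if last_calculated is None:
        # First run: cover the whole period since the earliest activity.
        start = earliest_activity
    else:
        # Later runs: continue after the last calculated day (assumption).
        start = last_calculated + timedelta(days=1)
    span = (yesterday - start).days + 1
    return [start + timedelta(days=i) for i in range(max(span, 0))]


def estimated_seconds(days: List[date]) -> int:
    """Estimate the run time from the per-day figure."""
    return len(days) * SECONDS_PER_DAY_ESTIMATE
```

For example, with a hypothetical earliest activity on 20 November 2009 and a first run on 8 May 2010, the function returns 169 days and the estimate is 507 seconds, about 8.5 minutes, in line with the ~9 mins reported.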