1
0
Fork 0
mirror of https://we.phorge.it/source/phorge.git synced 2024-12-12 00:26:13 +01:00
Commit graph

14 commits

Author SHA1 Message Date
epriestley
f91bef64f1 Stack chart functions in a more physical way
Summary:
Ref T13279. See that task for some discussion.

The accumulations of some of the datasets may be negative (e.g., if more tasks are moved out of a project than into it) which can lead to negative area in the stacked chart.

Introduce `min(...)` and `max(...)` to separate a function into points above or below some line, then mangle the areas to pick the negative and positive regions apart so they at least have a plausible physical interpretation and none of the areas are negative.

This is presumably not a final version, I'm just trying to produce a chart that isn't a sequence of overlapping regions with negative areas that is "technically" correct but not really possible to interpret.

Test Plan: {F6439195}

Reviewers: amckinley

Reviewed By: amckinley

Subscribers: yelirekim

Maniphest Tasks: T13279

Differential Revision: https://secure.phabricator.com/D20506
2019-05-22 05:40:39 -07:00
epriestley
da40f80741 Update PhabricatorLiskDAO::chunkSQL() for new %Q semantics
Summary:
Ref T13217. This method is slightly tricky:

  - We can't safely return a string: return an array instead.
  - It no longer makes sense to accept glue. All callers use `', '` as glue anyway, so hard-code that.

Then convert all callsites.

Test Plan: Browsed around, saw fewer "unsafe" errors in error log.

Reviewers: amckinley

Reviewed By: amckinley

Subscribers: yelirekim, PHID-OPKG-gm6ozazyms6q6i22gyam

Maniphest Tasks: T13217

Differential Revision: https://secure.phabricator.com/D19784
2018-11-13 08:59:18 -08:00
epriestley
2fb266de7c Fix some of the most obvious bugs in fact generation from Maniphest tasks
Summary:
Depends on D19121. Ref T13083. Group transactions and show groups in the debugging view.

Fix some of the most obvious issues with fact generation:

  - No more 0-point facts.
  - Engine can now generate at least one of every type of fact.

Test Plan: Generated facts, viewed them in the debugging view, fact generation largely appeared to align with reality. No more "no facts in storage" facts.

Subscribers: yelirekim

Maniphest Tasks: T13083

Differential Revision: https://secure.phabricator.com/D19122
2018-02-19 12:07:28 -08:00
epriestley
e3a1a32444 Extract count/point data from tasks in Fact engines
Summary:
Depends on D19119. Ref T13083. This is probably still very buggy, but I'm planning to build support tools to make debugging facts easier shortly.

This generates a large number of datapoints, at least, and can render some charts which aren't all completely broken in an obvious way.

Test Plan: Ran `bin/fact analyze --all`, got some charts with lines that went up and down in the web UI.

Subscribers: yelirekim

Maniphest Tasks: T13083

Differential Revision: https://secure.phabricator.com/D19120
2018-02-19 12:06:03 -08:00
epriestley
0dee34b3fa Make Facts more modern, DRY, and dimensional
Summary:
Ref T13083. Facts has a fair amount of weird hardcoding and duplication of responsibilities. Reduce this somewhat: no more hard-coded fact aggregates, no more database-driven list of available facts, etc. Generally, derive all objective truth from FactEngines. This is more similar to how most other modern applications work.

For clarity, hopefully: rename "FactSpec" to "Fact". Rename "RawFact" to "Datapoint".

Split the fairly optimistic "RawFact" table into an "IntDatapoint" table with less stuff in it, then dimension tables for the object PHIDs and key names. This is primarily aimed at reducing the row size of each datapoint. At the time I originally wrote this code we hadn't experimented much with storing similar data in multiple tables, but this is now more common and has worked well elsewhere (CustomFields, Edges, Ferret) so I don't anticipate this causing issues. If we need more complex or multidimension/multivalue tables later we can accommodate them. The queries a single table supports (like "all facts of all kinds in some time window") don't make any sense as far as I can tell and could likely be UNION ALL'd anyway.

Remove all the aggregation stuff for now, it's not really clear to me what this should look like.

Test Plan: Ran `bin/fact analyze` and viewed web UI. Nothing exploded too violently.

Subscribers: yelirekim

Maniphest Tasks: T13083

Differential Revision: https://secure.phabricator.com/D19119
2018-02-19 12:05:19 -08:00
epriestley
52a29be70d Introduce a request cache mechanism
Summary:
Ref T8424. This adds a standard KeyValueCache to serve as a request cache.

In particular, I need to cache Spaces (they are frequently accessed, sometimes by multiple viewers) but not have them survive longer than the scope of one request.

This request cache is explicitly destroyed by each web request and each daemon request.

In the very long term, building this kind of construct supports reusing PHP interpreters to run web requests (see some discussion in T2312).

Test Plan:
  - Added and executed unit tests.
  - Ran every daemon.

Reviewers: btrahan

Reviewed By: btrahan

Subscribers: epriestley

Maniphest Tasks: T8424

Differential Revision: https://secure.phabricator.com/D13153
2015-06-04 17:27:31 -07:00
Joshua Spence
36e2d02d6e phtize all the things
Summary: `pht`ize a whole bunch of strings in rP.

Test Plan: Intense eyeballing.

Reviewers: #blessed_reviewers, epriestley

Reviewed By: #blessed_reviewers, epriestley

Subscribers: hach-que, Korvin, epriestley

Differential Revision: https://secure.phabricator.com/D12797
2015-05-22 21:16:39 +10:00
Joshua Spence
62dfcd1e55 Fix the visibility of PhutilDaemon::run methods
Summary: Ref T6822. This method is only called from `PhutilDaemon::execute()` and can be made `protected`.

Test Plan: See D11404.

Reviewers: #blessed_reviewers, epriestley

Reviewed By: #blessed_reviewers, epriestley

Subscribers: Korvin, epriestley

Maniphest Tasks: T6822

Differential Revision: https://secure.phabricator.com/D11405
2015-01-16 06:59:29 +11:00
epriestley
9309723ac4 Send graceful shutdown signals to daemons in Phabricator
Summary:
Fixes T5855. Adds a `--graceful N` flag to `phd stop` and `phd restart`.

`phd` will send SIGINT, wait `N` seconds, SIGTERM, wait 15 seconds, and SIGKILL. By default, `N` is 15.

Test Plan:
  - Ran `bin/phd debug ...` and used `^C` to interrupt daemons. Saw graceful shutdown behavior, and abrupt termination on multiple `^C`.
  - Ran `bin/phd start`, `bin/phd stop` and `bin/phd restart` with `--graceful` set to various things, notably `0`. Saw graceful shutdowns on the CLI and in the web UI. With `0`, abrupt shutdowns.

Reviewers: btrahan, hach-que

Reviewed By: hach-que

Subscribers: epriestley

Maniphest Tasks: T5855

Differential Revision: https://secure.phabricator.com/D10228
2014-08-11 20:18:31 -07:00
Joshua Spence
0a62f13464 Change double quotes to single quotes.
Summary: Ran `arc lint --apply-patches --everything` over rP, mainly to change double quotes to single quotes where appropriate. These changes also validate that the `ArcanistXHPASTLinter::LINT_DOUBLE_QUOTE` rule is working as expected.

Test Plan: Eyeballed it.

Reviewers: #blessed_reviewers, epriestley

Reviewed By: #blessed_reviewers, epriestley

Subscribers: epriestley, Korvin, hach-que

Differential Revision: https://secure.phabricator.com/D9431
2014-06-09 11:36:50 -07:00
vrana
ef85f49adc Delete license headers from files
Summary:
This commit doesn't change license of any file. It just makes the license implicit (inherited from LICENSE file in the root directory).

We are removing the headers for these reasons:

- It wastes space in editors, less code is visible in editor upon opening a file.
- It brings noise to diff of the first change of any file every year.
- It confuses Git file copy detection when creating small files.
- We don't have an explicit license header in other files (JS, CSS, images, documentation).
- Using license header in every file is not obligatory: http://www.apache.org/dev/apply-license.html#new.

This change is approved by Alma Chao (Lead Open Source and IP Counsel at Facebook).

Test Plan: Verified that the license survived only in LICENSE file and that it didn't modify externals.

Reviewers: epriestley, davidrecordon

Reviewed By: epriestley

CC: aran, Korvin

Maniphest Tasks: T2035

Differential Revision: https://secure.phabricator.com/D3886
2012-11-05 11:16:51 -08:00
epriestley
f0af273165 Add FactCursors and application fact datasources
Summary:
  - Add PhabricatorApplication. This is a general class that I have grand designs for, but used here to allow applications to provide objects for analysis by the facts appliction.
  - Add FactCursors, to keep track of where iterators are.
  - Make the daemon do something sort of useful.
  - Add `bin/fact cursors` for showing and managing objects and cursors.
  - Add some options to `bin/fact analyze`.

Test Plan:
  - `bin/fact cursors`, `bin/fact cursors --reset DifferentialRevision`, `bin/fact cursors --reset X`
  - `bin/fact analyze`, `bin/fact analyze --all`, `bin/fact analyze --iterator DifferentialRevision --skip-aggregates`
  - `bin/phd debug fact`

Reviewers: vrana, btrahan

Reviewed By: btrahan

CC: aran

Maniphest Tasks: T1562

Differential Revision: https://secure.phabricator.com/D3098
2012-07-30 10:43:49 -07:00
epriestley
486f7c1e8e Add aggregated facts to the Facts application
Summary:
Some facts are aggregations of other facts. For example, we may compute how many times each macro is used in each object as a "raw fact":

  Dnnn uses macro "psyduck" 6 times.

But we want to present this data in aggregate form, e.g. "order macros by popularity". We can do this at runtime and it probably won't be too awful a query, but we can also aggregate it cheaply:

  Macro "psyduck" is used 3920 times across all objects.

...and then do a query like "select macros ordered by usage".

"Aggregate" facts support facts like this. The aggregate facts I've implemented are:

  - Count of all objects.
  - Count of objects of type X.
  - Last time facts were updated.

These clearly fit the "aggregate" facts template well. I'm not 100% sure macros do. We can use this table to answer a question like "What are the most popular macros, ordered by use?" We can also use it to answer a question like "What are the most popular macros in the last 6 months?", if we build a specific fact for that. But we can't use it to answer a question like "What are the most popular macros between times X and Y?". Maybe that's important; maybe not.

This seems like a good fit for at least some types of facts.

I'll de-magic the keys a bit in the next diff.

Test Plan: Ran the engines and got some aggregated facts about other facts.

Reviewers: vrana, btrahan

Reviewed By: vrana

CC: aran

Maniphest Tasks: T1562

Differential Revision: https://secure.phabricator.com/D3089
2012-07-27 13:46:01 -07:00
epriestley
7c934e4176 Add a basic "fact" application
Summary:
Basic "Fact" application with some storage, part of a daemon, and a control binary.

= Goals =

The general idea is that we have various statistics we'd like to compute, like the frequency of image macros, reviewer responsiveness, task close rates, etc. Computing these on page load is expensive and messy. By building an ETL pipeline and running it in a daemon, we can precompute statistics and just pull them out of "stats" tables.

One way to do this is just to completely hard-code everything, e.g. have a daemon that runs every hour which issues a big-ass query and dumps results into a table per-fact or per fact-group. But this has a bunch of drawbacks: adding new stuff to the pipeline is a pain, various fact aggregators can't share much code, updates are slow and expensive, we can never build generic graphs on top of it, etc.

I'm hoping to build an ETL pipeline which is generic enough that we can use it for most things we're interested in without needing schema changes, and so that installs can use it also without needing schema changes, while still being specific enough that it's fast and we can build useful stuff on top of it. I'm not sure if this will actually work, but it would be cool if it does so I'm starting pretty generally and we'll see how far I get. I haven't built this exact sort of thing before so I might be way off.

I'm basing the whole thing on analyzing entire objects, not analyzing changes to objects. So each part of the pipeline is handed an object and told "analyze this", not handed a change. It pretty much deletes all the old data about that thing and then writes new data. I think this is simpler to implement and understand, and it protects us from all sorts of weird issues where we end up with some kind of garbage in the DB and have to wipe the whole thing.

= Facts =

The general idea is that we extract "facts" out of objects, and then the various view interfaces just report those facts. This change has on type of fact, a "raw fact", which is directly derived from an object. These facts are concerete and relate specifically to the object they are derived from. Some examples of such facts might be:

  D123 has 9 comments.
  D123 uses macro "psyduck" 15 times.
  D123 adds 35 lines.
  D123 has 5 files.
  D123 has 1 object.
  D123 has 1 object of type "DREV".
  D123 was created at epoch timestamp 89812351235.
  D123 was accepted by @alincoln at epoch timestamp 8397981839.

The fact storage looks like this:

  <factType, objectPHID, objectA, valueX, valueY, epoch>

Currently, we supprot one optional secondary key (like a user PHID or macro PHID), two optional integer values, and an optional timestamp. We might add more later. Each fact type can use these fields if it wants. Some facts use them, others don't. For instance, this diff adds a "N:*" fact, which is just the count of total objects in the system. These facts just look like:

  <"N:*", "PHID-xxxx-yyyy", ...>

...where all other fields are ignored. But some of the more complex facts might look like:

  <"DREV:accept", "PHID-DREV-xxxx", "PHID-USER-yyyy", ..., ..., nnnn> # User 'yyyy' accepted at epoch 'nnnn'.
  <"FILE:macro", "PHID-DREV-xxxx", "PHID-MACR-yyyy", 17, ..., ...> # Object 'xxxx' uses macro 'yyyy' 17 times.

Facts have no uniqueness constraints. For @vrana's reviewer responsiveness stuff, we can insert multiple rows for each reviewer, e.g.

  <"DREV:reviewed", "PHID-DREV-xxxx", "PHID-USER-yyyy", nnnn, ..., mmmm> # User 'yyyy' reviewed revision 'xxxx' after 'nnnn' seconds at 'mmmm'.

The second value (valueY) is mostly because we need it if we sample anything (valueX = observed value, valueY = sample rate) but there might be other uses. We might need to add "objectB" at some point too -- currently we can't represent a fact like "User X used macro Y on revision Z", so it would be impossible to compute macro use rates //for a specific user// based on this schema. I think we can start here though and see how far we get.

= Aggregated Facts =

These aren't implemented yet, but the idea is that we can then take the "raw facts" and compute derived/aggregated/rollup facts based on the raw fact table. For example, the "count" fact can be aggregated to arrive at a count of all objects in the system. This stuff will live in a separate table which does have uniqueness constraints, and come in the next diff.

We might need some kind of time series facts too, not sure about that. I think most of our use cases today are covered by raw facts + aggregated facts.

Test Plan: Ran `bin/fact` commands and verified they seemed to do reasonable things.

Reviewers: vrana, btrahan

Reviewed By: vrana

CC: aran, majak

Maniphest Tasks: T1562

Differential Revision: https://secure.phabricator.com/D3078
2012-07-27 13:34:21 -07:00