Releases · IBM/fmwork

25 Aug 14:49

nelsonspbr

v1.0.4

13971cc

v1.0.4 Latest

infer/vllm/client

Removed --base-url http://localhost:8000. This may require changes to
downstream automation.

infer/vllm/process

Added --precision with a fp16 default value.
Added code to detect batch mode ('static' or 'continuous') for Spyre
integration. This requires VLLM_SPYRE_USE_CB to be explicitly defined and
printed in the server.log file. Note that this should be done automatically
by the runner-server integration.
Changed TTFT metric from the server's TTFT (via /metrics) to the client's TTFT.
Changed ITL metric from Mean TPOT to Median ITL, as reported by vLLM's
serving benchmark.
Improved support for experiments with datasets other than random (which
explicitly allows the definition of shapes): if no such definition is found
in the log files (e.g., if the sharegpt dataset was used), process reads
the appropriate lines from client.log to get the average input / output
sizes.
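
The client.log fallback described in the last item could look roughly like the sketch below. It is not the actual process code: it assumes client.log contains the summary printed by vLLM's serving benchmark, and the helper names (read_field, average_sizes) and the sample log are hypothetical. The release notes do not say which lines process reads, so the labels used here are an assumption.

```python
import re

# Hypothetical client.log excerpt, modeled on the summary printed by vLLM's
# serving benchmark. Real logs may contain more lines or different spacing.
SAMPLE_CLIENT_LOG = """\
Successful requests:                     8
Total input tokens:                      1024
Total generated tokens:                  512
Median ITL (ms):                         21.5
"""


def read_field(log_text, label):
    """Return the number on the line that starts with `label`, or None."""
    pattern = rf"^{re.escape(label)}\s*:\s*([0-9.]+)\s*$"
    match = re.search(pattern, log_text, flags=re.MULTILINE)
    return float(match.group(1)) if match else None


def average_sizes(log_text):
    """Derive average input / output sizes (tokens per request) from totals.

    Assumption: average = total tokens / successful requests.
    Returns None when any required line is missing or no request succeeded.
    """
    requests = read_field(log_text, "Successful requests")
    total_in = read_field(log_text, "Total input tokens")
    total_out = read_field(log_text, "Total generated tokens")
    if not requests or total_in is None or total_out is None:
        return None
    return total_in / requests, total_out / requests


if __name__ == "__main__":
    print(average_sizes(SAMPLE_CLIENT_LOG))
    print(read_field(SAMPLE_CLIENT_LOG, "Median ITL (ms)"))
```

Returning None instead of raising lets a caller fall back to another source when the summary lines are absent, which matches the "if no such definition is found" wording above.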

13 Aug 06:57

nelsonspbr

v1.0.3

87f7ae2

Finalized server-mode support for infer/vllm and added documentation.

12 Aug 14:14

nelsonspbr

v1.0.2

780af6b

Finalized support for direct and server modes for infer/vllm, including the process script.

Documentation pending — to be added momentarily.

01 Aug 20:39

nelsonspbr

v1.0.1

99d4ea6

General improvements to embed/tf.

Improved output formatting for arguments.
Added processing script.
Oh, and a README ☺️

01 Aug 08:14

nelsonspbr

v1.0.0

5358a87

Still a partial release — but now with the latest scripts to run encoder models on CPUs / GPUs / Spyre. Subsequent releases will cover decoder models, as well as more options / different engines.

30 Apr 14:14

nelsonspbr

v0.1.0

5a98eda

Freezing working version that is currently being used for internal experimental sweeps.
