java - Why does parallelStream not use the entire available parallelism?

I have a custom ForkJoinPool created with a parallelism of 25.

customForkJoinPool = new ForkJoinPool(25);

I have a list of 700 file names, and I used code like this to download the files from S3 in parallel and cast them to Java objects:

customForkJoinPool.submit(() -> {
    return fileNames
        .parallelStream()
        .map((fileName) -> {
            Logger log = Logger.getLogger("forkJoinTest");
            long startTime = System.currentTimeMillis();
            // Log the worker thread so the logs show how many threads are used.
            log.info("Starting job at thread:" + Thread.currentThread().getName());
            MyObject obj = readObjectFromS3(fileName);
            long endTime = System.currentTimeMillis();
            log.info("completed job with latency:" + (endTime - startTime));
            return obj;
        })
        .collect(Collectors.toList());
});

When I look at the logs, I see only 5 threads being used. With a parallelism of 25, I expected this to use 25 threads. The average latency to download and convert a file to an object is around 200ms. What am I missing?

Maybe a better question is: how does parallelStream figure out how much to split the original list before creating threads for it? In this case, it looks like it decided to split it 5 times and stop.
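One way to check the thread count without reading logs is to record the worker thread names directly. The following is a minimal sketch, not the original code: the class `ParallelismProbe`, the method `threadsUsed`, and the `pause` helper are hypothetical names, and a short sleep stands in for the blocking S3 read. It also prints the common pool's parallelism, on the assumption (worth verifying against your JDK version) that the stream framework's splitting may depend on it rather than on the custom pool's size.

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ForkJoinPool;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class ParallelismProbe {

    // Runs a parallel stream inside a ForkJoinPool of the given size and
    // returns the names of the distinct threads that executed the map step.
    static Set<String> threadsUsed(int poolSize, int items, long sleepMillis) {
        Set<String> seen = ConcurrentHashMap.newKeySet();
        List<Integer> input = IntStream.range(0, items).boxed().collect(Collectors.toList());
        ForkJoinPool pool = new ForkJoinPool(poolSize);
        try {
            pool.submit(() -> {
                return input.parallelStream()
                        .map(i -> {
                            seen.add(Thread.currentThread().getName());
                            pause(sleepMillis); // stand-in for the blocking S3 read
                            return i;
                        })
                        .collect(Collectors.toList());
            }).join(); // wait for the whole stream to finish
        } finally {
            pool.shutdown();
        }
        return seen;
    }

    // Sleeps without forcing callers to handle InterruptedException.
    static void pause(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println("common pool parallelism: " + ForkJoinPool.getCommonPoolParallelism());
        Set<String> seen = threadsUsed(25, 700, 20);
        System.out.println("distinct worker threads: " + seen.size());
    }
}
```

Comparing the two printed numbers on the machine that produced the logs should show whether the thread count tracks the custom pool's size or something else.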

Why are you doing this with a ForkJoinPool? It's meant for CPU-bound tasks with subtasks that are too fast to warrant individual scheduling. Your workload is IO-bound, and with 200ms latency the individual scheduling overhead is negligible.

Use an Executor:

import static java.util.stream.Collectors.toList;
import static java.util.concurrent.CompletableFuture.supplyAsync;

ExecutorService threads = Executors.newFixedThreadPool(25);

List<MyObject> result = fileNames.stream()
        // Submit every download to the fixed pool first.
        .map(fn -> supplyAsync(() -> readObjectFromS3(fn), threads))
        .collect(toList()).stream()
        // Then wait for each result, keeping the input order.
        .map(CompletableFuture::join)
        .collect(toList());
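To try the executor approach in isolation, here is a self-contained sketch under a few assumptions: `fakeRead` is a hypothetical stand-in for `readObjectFromS3` that just sleeps for 50ms, the results are plain strings instead of `MyObject`, and the class and method names are made up for illustration. It also shuts the pool down afterwards, which the snippet above leaves out.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class ExecutorDownloadSketch {

    // Hypothetical stand-in for readObjectFromS3: blocks briefly, then returns a value.
    static String fakeRead(String fileName) {
        try {
            Thread.sleep(50);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "object:" + fileName;
    }

    static List<String> readAll(List<String> fileNames, int threadCount) {
        ExecutorService threads = Executors.newFixedThreadPool(threadCount);
        try {
            // First pass: submit every task. Collecting into a list forces all
            // submissions to happen before any join() blocks.
            List<CompletableFuture<String>> futures = fileNames.stream()
                    .map(fn -> CompletableFuture.supplyAsync(() -> fakeRead(fn), threads))
                    .collect(Collectors.toList());
            // Second pass: wait for each result, preserving the input order.
            return futures.stream()
                    .map(CompletableFuture::join)
                    .collect(Collectors.toList());
        } finally {
            threads.shutdown(); // let the worker threads exit once the work is done
        }
    }

    public static void main(String[] args) {
        List<String> names = IntStream.range(0, 100)
                .mapToObj(i -> "file-" + i)
                .collect(Collectors.toList());
        long start = System.currentTimeMillis();
        List<String> result = readAll(names, 25);
        long elapsed = System.currentTimeMillis() - start;
        System.out.println(result.size() + " objects in " + elapsed + " ms");
    }
}
```

The intermediate list matters: streams are lazy, so mapping straight from `supplyAsync` to `join` in a single pipeline would submit and wait for one task at a time.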
