HTEX broken on Mac OS X #854

Open
tilmantroester opened this issue Apr 8, 2019 · 8 comments

@tilmantroester

commented Apr 8, 2019

I'm having trouble getting HTEX to work. Maybe I'm doing something fundamentally wrong, but the following MWE hangs for me with parsl 0.7.2:

import parsl
import os
from parsl.app.app import python_app
from parsl.configs.htex_local import config as htex_config
from parsl.configs.local_ipp import config as ipp_config

@python_app
def dummy_app(inputs=[], outputs=[]):
    return "Dummy app finished!"

if __name__ == "__main__":
    parsl.set_stream_logger()
    print(f"Using parsl version {parsl.__version__}")

    # This works
    # parsl.load(ipp_config)
    # This doesn't work
    parsl.load(htex_config)

    results = dummy_app()
    print(results.result())

The logger output is

2019-04-08 16:20:04 parsl.executors.high_throughput.executor:483 [DEBUG]  Launched block 0:32494
2019-04-08 16:20:04 parsl.dataflow.strategy:125 [DEBUG]  Scaling strategy: simple
2019-04-08 16:20:04 parsl.dataflow.dflow:670 [INFO]  Task 0 submitted for App dummy_app, waiting on tasks []
2019-04-08 16:20:04 parsl.dataflow.dflow:680 [DEBUG]  Task 0 set to pending state with AppFuture: <AppFuture at 0x103b29d30 state=pending>
2019-04-08 16:20:04 parsl.executors.high_throughput.executor:452 [DEBUG]  Pushing function <function dummy_app at 0x103b2d620> to queue with args ()
2019-04-08 16:20:04 parsl.dataflow.dflow:458 [INFO]  Task 0 launched on executor htex_local
2019-04-08 16:20:05 parsl.executors.high_throughput.executor:420 [DEBUG]  Got outstanding count: 1
2019-04-08 16:20:05 parsl.executors.high_throughput.executor:426 [DEBUG]  Got managers: [('617cd0ccc1bc', 0, True)]
2019-04-08 16:20:05 parsl.dataflow.strategy:206 [DEBUG]  Executor htex_local has 1 active tasks, 1/0/0 running/submitted/pending blocks, and 1 connected engines
2019-04-08 16:20:10 parsl.executors.high_throughput.executor:420 [DEBUG]  Got outstanding count: 1
2019-04-08 16:20:10 parsl.executors.high_throughput.executor:426 [DEBUG]  Got managers: [('617cd0ccc1bc', 0, True)]
2019-04-08 16:20:10 parsl.dataflow.strategy:206 [DEBUG]  Executor htex_local has 1 active tasks, 1/0/0 running/submitted/pending blocks, and 1 connected engines

with the last three messages repeating indefinitely; the program never finishes.

If I instead use the HTEX config from the docs (https://parsl.readthedocs.io/en/latest/parsl-introduction.html#Local-execution-with-pilot-jobs):

htex_config = Config(
    executors=[
        HighThroughputExecutor(
            label="htex_Local",
            worker_debug=True,
            cores_per_worker=1,
            provider=LocalProvider(
                channel=LocalChannel(),
                init_blocks=1,
                max_blocks=1,
            ),
        )
    ],
    strategy=None,
)

the program just hangs after

2019-04-08 16:18:36 parsl.executors.high_throughput.executor:483 [DEBUG]  Launched block 0:32436
2019-04-08 16:18:36 parsl.dataflow.strategy:125 [DEBUG]  Scaling strategy: None
2019-04-08 16:18:36 parsl.dataflow.dflow:670 [INFO]  Task 0 submitted for App dummy_app, waiting on tasks []
2019-04-08 16:18:36 parsl.dataflow.dflow:680 [DEBUG]  Task 0 set to pending state with AppFuture: <AppFuture at 0x107abdef0 state=pending>
2019-04-08 16:18:36 parsl.executors.high_throughput.executor:452 [DEBUG]  Pushing function <function dummy_app at 0x107ac2620> to queue with args ()
2019-04-08 16:18:36 parsl.dataflow.dflow:458 [INFO]  Task 0 launched on executor htex_Local

Using IPP (parsl.configs.local_ipp.config) or threads (parsl.configs.local_threads.config) works, however.

@ZhuozhaoLi

Contributor

commented Apr 8, 2019

Hi @tilmantroester, thanks for using Parsl.

I just tried your script with htex_local and it worked on my machine. What kind of machine did you run it on (e.g., the login node of a supercomputer, or a laptop)? Could you please paste the complete logs from the runinfo/XXX directory?

We can also help you debug this on our Slack channel (link: http://parsl-project.org/support.html)

@tilmantroester

Author

commented Apr 8, 2019

I'm running this on my laptop (macOS 10.14.4, python 3.7).

All log files
parsl.log (I killed the process at the end):

Config(
    app_cache=True, 
    checkpoint_files=None, 
    checkpoint_mode=None, 
    checkpoint_period=None, 
    data_management_max_threads=10, 
    executors=[HighThroughputExecutor(
        address='127.0.0.1', 
        cores_per_worker=1, 
        heartbeat_period=30, 
        heartbeat_threshold=120, 
        interchange_port_range=(55000, 56000), 
        label='htex_local', 
        launch_cmd='process_worker_pool.py {debug} {max_workers} -c {cores_per_worker} --poll {poll_period} --task_url={task_url} --result_url={result_url} --logdir={logdir} --hb_period={heartbeat_period} --hb_threshold={heartbeat_threshold} ', 
        managed=True, 
        max_workers=inf, 
        poll_period=10, 
        provider=LocalProvider(
            channel=LocalChannel(
                envs={}, 
                script_dir=None, 
                userhome='/Users/yooken/Research/KiDS/pipeline_testing'
            ), 
            cmd_timeout=30, 
            init_blocks=1, 
            launcher=SingleNodeLauncher(), 
            max_blocks=1, 
            min_blocks=0, 
            move_files=None, 
            nodes_per_block=1, 
            parallelism=1, 
            walltime='00:15:00', 
            worker_init=''
        ), 
        storage_access=[], 
        suppress_failure=False, 
        worker_debug=False, 
        worker_port_range=(54000, 55000), 
        worker_ports=None, 
        working_dir=None
    )], 
    lazy_errors=True, 
    monitoring=None, 
    retries=0, 
    run_dir='runinfo', 
    strategy='simple', 
    usage_tracking=False
)
2019-04-08 18:27:45.801 parsl.dataflow.dflow:79 [INFO]  Parsl version: 0.7.2
2019-04-08 18:27:45.801 parsl.dataflow.usage_tracking.usage:126 [DEBUG]  Tracking status: False
2019-04-08 18:27:45.801 parsl.dataflow.usage_tracking.usage:127 [DEBUG]  Testing mode   : False
2019-04-08 18:27:45.802 parsl.dataflow.dflow:101 [INFO]  Run id is: e31088b3-5dfb-490a-9944-d8c41acbdf4b
2019-04-08 18:27:45.918 parsl.dataflow.memoization:52 [INFO]  App caching initialized
2019-04-08 18:27:45.921 parsl.executors.high_throughput.executor:392 [DEBUG]  Starting queue management thread
2019-04-08 18:27:45.921 parsl.executors.high_throughput.executor:274 [DEBUG]  [MTHREAD] queue management worker starting
2019-04-08 18:27:45.922 parsl.executors.high_throughput.executor:396 [DEBUG]  Started queue management thread
2019-04-08 18:27:45.948 parsl.executors.high_throughput.executor:233 [DEBUG]  Created management thread: <Thread(Thread-1, started daemon 123145499291648)>
2019-04-08 18:27:45.948 parsl.executors.high_throughput.executor:207 [DEBUG]  Launch command: process_worker_pool.py   -c 1 --poll 10 --task_url=tcp://127.0.0.1:54685 --result_url=tcp://127.0.0.1:54393 --logdir=/Users/yooken/Research/KiDS/pipeline_testing/runinfo/005/htex_local --hb_period=30 --hb_threshold=120 
2019-04-08 18:27:45.948 parsl.executors.high_throughput.executor:210 [DEBUG]  Starting HighThroughputExecutor with provider:
LocalProvider(
    channel=LocalChannel(
        envs={}, 
        script_dir='/Users/yooken/Research/KiDS/pipeline_testing/runinfo/005/submit_scripts', 
        userhome='/Users/yooken/Research/KiDS/pipeline_testing'
    ), 
    cmd_timeout=30, 
    init_blocks=1, 
    launcher=SingleNodeLauncher(), 
    max_blocks=1, 
    min_blocks=0, 
    move_files=None, 
    nodes_per_block=1, 
    parallelism=1, 
    walltime='00:15:00', 
    worker_init=''
)
2019-04-08 18:27:45.962 parsl.executors.high_throughput.executor:483 [DEBUG]  Launched block 0:37003
2019-04-08 18:27:45.963 parsl.dataflow.strategy:125 [DEBUG]  Scaling strategy: simple
2019-04-08 18:27:45.963 parsl.dataflow.dflow:670 [INFO]  Task 0 submitted for App dummy_app, waiting on tasks []
2019-04-08 18:27:45.964 parsl.dataflow.dflow:680 [DEBUG]  Task 0 set to pending state with AppFuture: <AppFuture at 0x1065a4cf8 state=pending>
2019-04-08 18:27:45.964 parsl.executors.high_throughput.executor:452 [DEBUG]  Pushing function <function dummy_app at 0x1065a7620> to queue with args ()
2019-04-08 18:27:45.966 parsl.dataflow.dflow:458 [INFO]  Task 0 launched on executor htex_local
2019-04-08 18:27:45.966 parsl.dataflow.dflow:786 [INFO]  Waiting for all remaining tasks to complete
2019-04-08 18:27:45.966 parsl.dataflow.dflow:792 [DEBUG]  Waiting for task 0 to complete
2019-04-08 18:27:46.968 parsl.executors.high_throughput.executor:420 [DEBUG]  Got outstanding count: 1
2019-04-08 18:27:46.969 parsl.executors.high_throughput.executor:426 [DEBUG]  Got managers: [('e844a4104139', 0, True)]
2019-04-08 18:27:46.969 parsl.dataflow.strategy:206 [DEBUG]  Executor htex_local has 1 active tasks, 1/0/0 running/submitted/pending blocks, and 1 connected engines
2019-04-08 18:27:51.967 parsl.executors.high_throughput.executor:420 [DEBUG]  Got outstanding count: 1
2019-04-08 18:27:51.967 parsl.executors.high_throughput.executor:426 [DEBUG]  Got managers: [('e844a4104139', 0, True)]
2019-04-08 18:27:51.967 parsl.dataflow.strategy:206 [DEBUG]  Executor htex_local has 1 active tasks, 1/0/0 running/submitted/pending blocks, and 1 connected engines
2019-04-08 18:27:56.968 parsl.executors.high_throughput.executor:420 [DEBUG]  Got outstanding count: 1
2019-04-08 18:27:56.968 parsl.executors.high_throughput.executor:426 [DEBUG]  Got managers: [('e844a4104139', 0, True)]
2019-04-08 18:27:56.969 parsl.dataflow.strategy:206 [DEBUG]  Executor htex_local has 1 active tasks, 1/0/0 running/submitted/pending blocks, and 1 connected engines
2019-04-08 18:28:01.973 parsl.executors.high_throughput.executor:420 [DEBUG]  Got outstanding count: 1
2019-04-08 18:28:01.974 parsl.executors.high_throughput.executor:426 [DEBUG]  Got managers: [('e844a4104139', 0, True)]
2019-04-08 18:28:01.974 parsl.dataflow.strategy:206 [DEBUG]  Executor htex_local has 1 active tasks, 1/0/0 running/submitted/pending blocks, and 1 connected engines
2019-04-08 18:28:06.977 parsl.executors.high_throughput.executor:420 [DEBUG]  Got outstanding count: 1
2019-04-08 18:28:06.977 parsl.executors.high_throughput.executor:426 [DEBUG]  Got managers: [('e844a4104139', 0, True)]
2019-04-08 18:28:06.978 parsl.dataflow.strategy:206 [DEBUG]  Executor htex_local has 1 active tasks, 1/0/0 running/submitted/pending blocks, and 1 connected engines
2019-04-08 18:28:11.982 parsl.executors.high_throughput.executor:420 [DEBUG]  Got outstanding count: 1
2019-04-08 18:28:11.983 parsl.executors.high_throughput.executor:426 [DEBUG]  Got managers: [('e844a4104139', 0, True)]
2019-04-08 18:28:11.983 parsl.dataflow.strategy:206 [DEBUG]  Executor htex_local has 1 active tasks, 1/0/0 running/submitted/pending blocks, and 1 connected engines
2019-04-08 18:28:16.984 parsl.executors.high_throughput.executor:420 [DEBUG]  Got outstanding count: 1
2019-04-08 18:28:16.985 parsl.executors.high_throughput.executor:426 [DEBUG]  Got managers: [('e844a4104139', 0, True)]
2019-04-08 18:28:16.985 parsl.dataflow.strategy:206 [DEBUG]  Executor htex_local has 1 active tasks, 1/0/0 running/submitted/pending blocks, and 1 connected engines
2019-04-08 18:28:21.987 parsl.executors.high_throughput.executor:420 [DEBUG]  Got outstanding count: 1
2019-04-08 18:28:21.988 parsl.executors.high_throughput.executor:426 [DEBUG]  Got managers: [('e844a4104139', 0, True)]
2019-04-08 18:28:21.988 parsl.dataflow.strategy:206 [DEBUG]  Executor htex_local has 1 active tasks, 1/0/0 running/submitted/pending blocks, and 1 connected engines
2019-04-08 18:28:26.992 parsl.executors.high_throughput.executor:420 [DEBUG]  Got outstanding count: 1
2019-04-08 18:28:26.993 parsl.executors.high_throughput.executor:426 [DEBUG]  Got managers: [('e844a4104139', 0, True)]
2019-04-08 18:28:26.993 parsl.dataflow.strategy:206 [DEBUG]  Executor htex_local has 1 active tasks, 1/0/0 running/submitted/pending blocks, and 1 connected engines
2019-04-08 18:28:31.993 parsl.executors.high_throughput.executor:420 [DEBUG]  Got outstanding count: 1
2019-04-08 18:28:31.994 parsl.executors.high_throughput.executor:426 [DEBUG]  Got managers: [('e844a4104139', 0, True)]
2019-04-08 18:28:31.994 parsl.dataflow.strategy:206 [DEBUG]  Executor htex_local has 1 active tasks, 1/0/0 running/submitted/pending blocks, and 1 connected engines
2019-04-08 18:28:36.994 parsl.executors.high_throughput.executor:420 [DEBUG]  Got outstanding count: 1
2019-04-08 18:28:36.995 parsl.executors.high_throughput.executor:426 [DEBUG]  Got managers: [('e844a4104139', 0, True)]
2019-04-08 18:28:36.995 parsl.dataflow.strategy:206 [DEBUG]  Executor htex_local has 1 active tasks, 1/0/0 running/submitted/pending blocks, and 1 connected engines
2019-04-08 18:28:41.995 parsl.executors.high_throughput.executor:420 [DEBUG]  Got outstanding count: 1
2019-04-08 18:28:41.996 parsl.executors.high_throughput.executor:426 [DEBUG]  Got managers: [('e844a4104139', 0, True)]
2019-04-08 18:28:41.997 parsl.dataflow.strategy:206 [DEBUG]  Executor htex_local has 1 active tasks, 1/0/0 running/submitted/pending blocks, and 1 connected engines
2019-04-08 18:28:46.999 parsl.executors.high_throughput.executor:420 [DEBUG]  Got outstanding count: 1
2019-04-08 18:28:47.000 parsl.executors.high_throughput.executor:426 [DEBUG]  Got managers: [('e844a4104139', 0, True)]
2019-04-08 18:28:47.000 parsl.dataflow.strategy:206 [DEBUG]  Executor htex_local has 1 active tasks, 1/0/0 running/submitted/pending blocks, and 1 connected engines
2019-04-08 18:28:52.003 parsl.executors.high_throughput.executor:420 [DEBUG]  Got outstanding count: 1
2019-04-08 18:28:52.014 parsl.executors.high_throughput.executor:426 [DEBUG]  Got managers: [('e844a4104139', 0, True)]
2019-04-08 18:28:52.015 parsl.dataflow.strategy:206 [DEBUG]  Executor htex_local has 1 active tasks, 1/0/0 running/submitted/pending blocks, and 1 connected engines
2019-04-08 18:28:57.007 parsl.executors.high_throughput.executor:420 [DEBUG]  Got outstanding count: 1
2019-04-08 18:28:57.007 parsl.executors.high_throughput.executor:426 [DEBUG]  Got managers: [('e844a4104139', 0, True)]
2019-04-08 18:28:57.008 parsl.dataflow.strategy:206 [DEBUG]  Executor htex_local has 1 active tasks, 1/0/0 running/submitted/pending blocks, and 1 connected engines
2019-04-08 18:29:02.012 parsl.executors.high_throughput.executor:420 [DEBUG]  Got outstanding count: 1
2019-04-08 18:29:02.013 parsl.executors.high_throughput.executor:426 [DEBUG]  Got managers: [('e844a4104139', 0, True)]
2019-04-08 18:29:02.013 parsl.dataflow.strategy:206 [DEBUG]  Executor htex_local has 1 active tasks, 1/0/0 running/submitted/pending blocks, and 1 connected engines
2019-04-08 18:29:07.015 parsl.executors.high_throughput.executor:420 [DEBUG]  Got outstanding count: 1
2019-04-08 18:29:07.016 parsl.executors.high_throughput.executor:426 [DEBUG]  Got managers: [('e844a4104139', 0, True)]
2019-04-08 18:29:07.016 parsl.dataflow.strategy:206 [DEBUG]  Executor htex_local has 1 active tasks, 1/0/0 running/submitted/pending blocks, and 1 connected engines
2019-04-08 18:29:12.018 parsl.executors.high_throughput.executor:420 [DEBUG]  Got outstanding count: 1
2019-04-08 18:29:12.019 parsl.executors.high_throughput.executor:426 [DEBUG]  Got managers: [('e844a4104139', 0, True)]
2019-04-08 18:29:12.019 parsl.dataflow.strategy:206 [DEBUG]  Executor htex_local has 1 active tasks, 1/0/0 running/submitted/pending blocks, and 1 connected engines
2019-04-08 18:29:17.020 parsl.executors.high_throughput.executor:420 [DEBUG]  Got outstanding count: 1
2019-04-08 18:29:17.020 parsl.executors.high_throughput.executor:426 [DEBUG]  Got managers: [('e844a4104139', 0, True)]
2019-04-08 18:29:17.021 parsl.dataflow.strategy:206 [DEBUG]  Executor htex_local has 1 active tasks, 1/0/0 running/submitted/pending blocks, and 1 connected engines
2019-04-08 18:29:22.024 parsl.executors.high_throughput.executor:420 [DEBUG]  Got outstanding count: 1
2019-04-08 18:29:22.024 parsl.executors.high_throughput.executor:426 [DEBUG]  Got managers: [('e844a4104139', 0, True)]
2019-04-08 18:29:22.025 parsl.dataflow.strategy:206 [DEBUG]  Executor htex_local has 1 active tasks, 1/0/0 running/submitted/pending blocks, and 1 connected engines
2019-04-08 18:29:27.024 parsl.executors.high_throughput.executor:420 [DEBUG]  Got outstanding count: 1
2019-04-08 18:29:27.025 parsl.executors.high_throughput.executor:426 [DEBUG]  Got managers: [('e844a4104139', 0, True)]
2019-04-08 18:29:27.025 parsl.dataflow.strategy:206 [DEBUG]  Executor htex_local has 1 active tasks, 1/0/0 running/submitted/pending blocks, and 1 connected engines
2019-04-08 18:29:32.028 parsl.executors.high_throughput.executor:420 [DEBUG]  Got outstanding count: 1
2019-04-08 18:29:32.028 parsl.executors.high_throughput.executor:426 [DEBUG]  Got managers: [('e844a4104139', 0, True)]
2019-04-08 18:29:32.029 parsl.dataflow.strategy:206 [DEBUG]  Executor htex_local has 1 active tasks, 1/0/0 running/submitted/pending blocks, and 1 connected engines
2019-04-08 18:29:37.033 parsl.executors.high_throughput.executor:420 [DEBUG]  Got outstanding count: 1
2019-04-08 18:29:37.034 parsl.executors.high_throughput.executor:426 [DEBUG]  Got managers: [('e844a4104139', 0, True)]
2019-04-08 18:29:37.034 parsl.dataflow.strategy:206 [DEBUG]  Executor htex_local has 1 active tasks, 1/0/0 running/submitted/pending blocks, and 1 connected engines
2019-04-08 18:29:42.034 parsl.executors.high_throughput.executor:420 [DEBUG]  Got outstanding count: 1
2019-04-08 18:29:42.035 parsl.executors.high_throughput.executor:426 [DEBUG]  Got managers: [('e844a4104139', 0, True)]
2019-04-08 18:29:42.035 parsl.dataflow.strategy:206 [DEBUG]  Executor htex_local has 1 active tasks, 1/0/0 running/submitted/pending blocks, and 1 connected engines
2019-04-08 18:29:47.036 parsl.executors.high_throughput.executor:420 [DEBUG]  Got outstanding count: 1
2019-04-08 18:29:47.036 parsl.executors.high_throughput.executor:426 [DEBUG]  Got managers: []
2019-04-08 18:29:47.037 parsl.dataflow.strategy:206 [DEBUG]  Executor htex_local has 1 active tasks, 1/0/0 running/submitted/pending blocks, and 0 connected engines
2019-04-08 18:29:52.036 parsl.executors.high_throughput.executor:420 [DEBUG]  Got outstanding count: 1
2019-04-08 18:29:52.046 parsl.executors.high_throughput.executor:426 [DEBUG]  Got managers: []
2019-04-08 18:29:52.046 parsl.dataflow.strategy:206 [DEBUG]  Executor htex_local has 1 active tasks, 1/0/0 running/submitted/pending blocks, and 0 connected engines
2019-04-08 18:29:57.036 parsl.executors.high_throughput.executor:420 [DEBUG]  Got outstanding count: 1
2019-04-08 18:29:57.037 parsl.executors.high_throughput.executor:426 [DEBUG]  Got managers: []
2019-04-08 18:29:57.037 parsl.dataflow.strategy:206 [DEBUG]  Executor htex_local has 1 active tasks, 1/0/0 running/submitted/pending blocks, and 0 connected engines
2019-04-08 18:30:02.038 parsl.executors.high_throughput.executor:420 [DEBUG]  Got outstanding count: 1
2019-04-08 18:30:02.039 parsl.executors.high_throughput.executor:426 [DEBUG]  Got managers: []
2019-04-08 18:30:02.039 parsl.dataflow.strategy:206 [DEBUG]  Executor htex_local has 1 active tasks, 1/0/0 running/submitted/pending blocks, and 0 connected engines
2019-04-08 18:30:07.041 parsl.executors.high_throughput.executor:420 [DEBUG]  Got outstanding count: 1
2019-04-08 18:30:07.042 parsl.executors.high_throughput.executor:426 [DEBUG]  Got managers: []
2019-04-08 18:30:07.042 parsl.dataflow.strategy:206 [DEBUG]  Executor htex_local has 1 active tasks, 1/0/0 running/submitted/pending blocks, and 0 connected engines
2019-04-08 18:30:12.045 parsl.executors.high_throughput.executor:420 [DEBUG]  Got outstanding count: 1
2019-04-08 18:30:12.054 parsl.executors.high_throughput.executor:426 [DEBUG]  Got managers: []
2019-04-08 18:30:12.054 parsl.dataflow.strategy:206 [DEBUG]  Executor htex_local has 1 active tasks, 1/0/0 running/submitted/pending blocks, and 0 connected engines
2019-04-08 18:30:17.046 parsl.executors.high_throughput.executor:420 [DEBUG]  Got outstanding count: 1
2019-04-08 18:30:17.047 parsl.executors.high_throughput.executor:426 [DEBUG]  Got managers: []
2019-04-08 18:30:17.048 parsl.dataflow.strategy:206 [DEBUG]  Executor htex_local has 1 active tasks, 1/0/0 running/submitted/pending blocks, and 0 connected engines
2019-04-08 18:30:22.047 parsl.executors.high_throughput.executor:420 [DEBUG]  Got outstanding count: 1
2019-04-08 18:30:22.047 parsl.executors.high_throughput.executor:426 [DEBUG]  Got managers: []
2019-04-08 18:30:22.048 parsl.dataflow.strategy:206 [DEBUG]  Executor htex_local has 1 active tasks, 1/0/0 running/submitted/pending blocks, and 0 connected engines
2019-04-08 18:30:27.047 parsl.executors.high_throughput.executor:420 [DEBUG]  Got outstanding count: 1
2019-04-08 18:30:27.048 parsl.executors.high_throughput.executor:426 [DEBUG]  Got managers: []
2019-04-08 18:30:27.048 parsl.dataflow.strategy:206 [DEBUG]  Executor htex_local has 1 active tasks, 1/0/0 running/submitted/pending blocks, and 0 connected engines
2019-04-08 18:30:32.047 parsl.executors.high_throughput.executor:420 [DEBUG]  Got outstanding count: 1
2019-04-08 18:30:32.048 parsl.executors.high_throughput.executor:426 [DEBUG]  Got managers: []
2019-04-08 18:30:32.048 parsl.dataflow.strategy:206 [DEBUG]  Executor htex_local has 1 active tasks, 1/0/0 running/submitted/pending blocks, and 0 connected engines
2019-04-08 18:30:37.048 parsl.executors.high_throughput.executor:420 [DEBUG]  Got outstanding count: 1
2019-04-08 18:30:37.049 parsl.executors.high_throughput.executor:426 [DEBUG]  Got managers: []
2019-04-08 18:30:37.049 parsl.dataflow.strategy:206 [DEBUG]  Executor htex_local has 1 active tasks, 1/0/0 running/submitted/pending blocks, and 0 connected engines
2019-04-08 18:30:42.054 parsl.executors.high_throughput.executor:420 [DEBUG]  Got outstanding count: 1
2019-04-08 18:30:42.055 parsl.executors.high_throughput.executor:426 [DEBUG]  Got managers: []
2019-04-08 18:30:42.055 parsl.dataflow.strategy:206 [DEBUG]  Executor htex_local has 1 active tasks, 1/0/0 running/submitted/pending blocks, and 0 connected engines
2019-04-08 18:30:47.055 parsl.executors.high_throughput.executor:420 [DEBUG]  Got outstanding count: 1
2019-04-08 18:30:47.056 parsl.executors.high_throughput.executor:426 [DEBUG]  Got managers: []
2019-04-08 18:30:47.056 parsl.dataflow.strategy:206 [DEBUG]  Executor htex_local has 1 active tasks, 1/0/0 running/submitted/pending blocks, and 0 connected engines
2019-04-08 18:30:52.055 parsl.executors.high_throughput.executor:420 [DEBUG]  Got outstanding count: 1
2019-04-08 18:30:52.065 parsl.executors.high_throughput.executor:426 [DEBUG]  Got managers: []
2019-04-08 18:30:52.065 parsl.dataflow.strategy:206 [DEBUG]  Executor htex_local has 1 active tasks, 1/0/0 running/submitted/pending blocks, and 0 connected engines
2019-04-08 18:30:57.057 parsl.executors.high_throughput.executor:420 [DEBUG]  Got outstanding count: 1
2019-04-08 18:30:57.058 parsl.executors.high_throughput.executor:426 [DEBUG]  Got managers: []
2019-04-08 18:30:57.058 parsl.dataflow.strategy:206 [DEBUG]  Executor htex_local has 1 active tasks, 1/0/0 running/submitted/pending blocks, and 0 connected engines
2019-04-08 18:31:02.057 parsl.executors.high_throughput.executor:420 [DEBUG]  Got outstanding count: 1
2019-04-08 18:31:02.058 parsl.executors.high_throughput.executor:426 [DEBUG]  Got managers: []
2019-04-08 18:31:02.058 parsl.dataflow.strategy:206 [DEBUG]  Executor htex_local has 1 active tasks, 1/0/0 running/submitted/pending blocks, and 0 connected engines
2019-04-08 18:31:07.062 parsl.executors.high_throughput.executor:420 [DEBUG]  Got outstanding count: 1
2019-04-08 18:31:07.063 parsl.executors.high_throughput.executor:426 [DEBUG]  Got managers: []
2019-04-08 18:31:07.063 parsl.dataflow.strategy:206 [DEBUG]  Executor htex_local has 1 active tasks, 1/0/0 running/submitted/pending blocks, and 0 connected engines
2019-04-08 18:31:12.063 parsl.executors.high_throughput.executor:420 [DEBUG]  Got outstanding count: 1
2019-04-08 18:31:12.064 parsl.executors.high_throughput.executor:426 [DEBUG]  Got managers: []
2019-04-08 18:31:12.065 parsl.dataflow.strategy:206 [DEBUG]  Executor htex_local has 1 active tasks, 1/0/0 running/submitted/pending blocks, and 0 connected engines
2019-04-08 18:31:17.067 parsl.executors.high_throughput.executor:420 [DEBUG]  Got outstanding count: 1
2019-04-08 18:31:17.068 parsl.executors.high_throughput.executor:426 [DEBUG]  Got managers: []
2019-04-08 18:31:17.068 parsl.dataflow.strategy:206 [DEBUG]  Executor htex_local has 1 active tasks, 1/0/0 running/submitted/pending blocks, and 0 connected engines
2019-04-08 18:31:22.067 parsl.executors.high_throughput.executor:420 [DEBUG]  Got outstanding count: 1
2019-04-08 18:31:22.067 parsl.executors.high_throughput.executor:426 [DEBUG]  Got managers: []
2019-04-08 18:31:22.068 parsl.dataflow.strategy:206 [DEBUG]  Executor htex_local has 1 active tasks, 1/0/0 running/submitted/pending blocks, and 0 connected engines
2019-04-08 18:31:27.072 parsl.executors.high_throughput.executor:420 [DEBUG]  Got outstanding count: 1
2019-04-08 18:31:27.073 parsl.executors.high_throughput.executor:426 [DEBUG]  Got managers: []
2019-04-08 18:31:27.073 parsl.dataflow.strategy:206 [DEBUG]  Executor htex_local has 1 active tasks, 1/0/0 running/submitted/pending blocks, and 0 connected engines
2019-04-08 18:31:29.960 parsl.dataflow.dflow:805 [INFO]  DFK cleanup initiated
2019-04-08 18:31:29.960 parsl.dataflow.dflow:716 [INFO]  Summary of tasks in DFK:
2019-04-08 18:31:29.960 parsl.dataflow.dflow:747 [INFO]  Tasks in state States.launched: 0
2019-04-08 18:31:29.960 parsl.dataflow.dflow:754 [INFO]  End of summary
2019-04-08 18:31:29.960 parsl.dataflow.dflow:829 [INFO]  Terminating flow_control and strategy threads
2019-04-08 18:31:29.960 parsl.providers.local.local:231 [DEBUG]  Terminating job/proc_id: 37003
2019-04-08 18:31:29.961 parsl.executors.high_throughput.executor:531 [WARNING]  Attempting HighThroughputExecutor shutdown
2019-04-08 18:31:29.961 parsl.executors.high_throughput.executor:535 [WARNING]  Finished HighThroughputExecutor shutdown attempt
2019-04-08 18:31:29.961 parsl.data_provider.data_manager:99 [DEBUG]  Done with executor shutdown
2019-04-08 18:31:29.961 parsl.dataflow.dflow:861 [INFO]  DFK cleanup complete

@yadudoc

Member

commented Apr 9, 2019

@tilmantroester Thanks for reporting this issue. Unfortunately, we've had very limited testing on Mac systems, so I am not surprised that there are some Mac-specific issues.

Looking through the logs, most of the executor system seems to have come online correctly.
If you get a chance, could you please retry the test with worker_debug=True set in the executor config and share the logs here?

@annawoodard is a Mac user and she's going to take a crack at this issue soon.

@yadudoc added the bug label on Apr 9, 2019

@yadudoc added this to the Parsl-0.8.0 milestone on Apr 9, 2019

@tilmantroester

Author

commented Apr 10, 2019

Thanks for looking into this! Ultimately, I'd deploy it on a Linux machine, but for development purposes it would be nice to test things locally.
The logs with worker_debug=True are here.

Here's the MWE used to create those logs:

import parsl
import os
from parsl.app.app import python_app
# from parsl.configs.htex_local import config as htex_config
from parsl.configs.local_ipp import config as ipp_config
from parsl.configs.local_threads import config as threads_config

from parsl.providers import LocalProvider
from parsl.channels import LocalChannel
from parsl.config import Config
from parsl.executors import HighThroughputExecutor

htex_config = Config(
    executors=[
        HighThroughputExecutor(
            label="htex_Local",
            worker_debug=True,
            cores_per_worker=1,
            provider=LocalProvider(
                channel=LocalChannel(),
                init_blocks=1,
                max_blocks=1,
            ),
        )
    ],
)

@python_app
def dummy_app():
    return "Dummy app finished!"

if __name__ == "__main__":
    parsl.set_stream_logger()
    print(f"Using parsl version {parsl.__version__}")

    parsl.clear()
    # This works
    # parsl.load(ipp_config)
    # parsl.load(threads_config)
    
    # This doesn't work
    parsl.load(htex_config)

    print("Launching dummy app")
    results = dummy_app()
    print("Launched dummy app")
    parsl.wait_for_current_tasks()
    # print(results.result())

interchange.log contains many repeats of the following:

2019-04-10 09:58:10.306 interchange:233 [DEBUG]  [TASK_PULL_THREAD] 1 tasks in internal queue
2019-04-10 09:58:10.314 interchange:400 [DEBUG]  Managers count (total/interesting): 1/0
2019-04-10 09:58:10.314 interchange:430 [DEBUG]  [MAIN] either no interesting managers or no tasks, so skipping manager pass
2019-04-10 09:58:10.315 interchange:447 [DEBUG]  [MAIN] entering bad_managers section
2019-04-10 09:58:10.315 interchange:460 [DEBUG]  [MAIN] leaving bad_managers section
2019-04-10 09:58:10.316 interchange:461 [DEBUG]  [MAIN] ending one main loop iteration

@annawoodard

Collaborator

commented Apr 12, 2019

It looks like this is because multiprocessing.Queue.qsize() is not implemented on Mac:

https://docs.python.org/3/library/multiprocessing.html#multiprocessing.Queue.qsize

Note that this may raise NotImplementedError on Unix platforms like Mac OS X where sem_getvalue() is not implemented.
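
To check whether a given machine is affected, a small probe along these lines may help. This is only a sketch: `qsize_supported` is a hypothetical helper name, not something from Parsl, and it relies solely on the documented behaviour quoted above.

```python
import multiprocessing


def qsize_supported():
    """Return True if multiprocessing.Queue.qsize() works on this platform."""
    queue = multiprocessing.Queue()
    try:
        queue.qsize()
        return True
    except NotImplementedError:
        # Raised on platforms where sem_getvalue() is unavailable, e.g. Mac OS X.
        return False
    finally:
        # Release the queue's pipe and any helper thread.
        queue.close()
        queue.join_thread()
```

On an affected Mac this would be expected to return False, while on Linux it should return True.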

Someone else already solved this problem, and their approach looks reasonable to me: vterron/lemon@9ca6b4b
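
For illustration, one common way to avoid relying on qsize() is to keep a separate shared counter next to the queue. The sketch below is a minimal version of that idea and is not necessarily what the linked commit or Parsl does; the class name `CountedQueue` and its helper `_adjust` are hypothetical.

```python
import multiprocessing


class CountedQueue:
    """Wrap multiprocessing.Queue and track its size in a shared counter.

    Hypothetical helper for illustration only; not part of Parsl.
    """

    def __init__(self, maxsize=0):
        self._queue = multiprocessing.Queue(maxsize)
        # Shared integer guarded by its own lock.
        self._size = multiprocessing.Value("i", 0)

    def _adjust(self, delta):
        with self._size.get_lock():
            self._size.value += delta

    def put(self, item, block=True, timeout=None):
        # Count first so a fast consumer can never drive the counter negative.
        self._adjust(1)
        try:
            self._queue.put(item, block, timeout)
        except Exception:
            # Undo the count if the put failed (for example queue.Full).
            self._adjust(-1)
            raise

    def get(self, block=True, timeout=None):
        item = self._queue.get(block, timeout)
        self._adjust(-1)
        return item

    def qsize(self):
        # Approximate under concurrency, like any queue size.
        return self._size.value

    def empty(self):
        return self.qsize() == 0
```

Because the counter is a plain shared integer, it does not depend on sem_getvalue(), so qsize() here should behave the same on Mac and Linux.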

@tilmantroester

Author

commented Apr 23, 2019

While this is being implemented, is there a workaround that I could use? Or should I just stick with IPP for now?

@yadudoc

Member

commented Apr 24, 2019

@tilmantroester I suspect that fixes to HTEX will take a bit of time, so I'd recommend using IPP locally and switching to HTEX when running on the remote system. Just to be clear, you'd have to be running entirely on Linux-based systems for HTEX to work right now.
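
A rough sketch of how that switch could be automated, assuming the two bundled configs used earlier in this thread (parsl.configs.local_ipp and parsl.configs.htex_local, as shipped with parsl 0.7.2). The function names `pick_config_module` and `load_platform_config` are hypothetical helpers, not part of Parsl.

```python
import importlib
import platform


def pick_config_module(system=None):
    """Return the name of the bundled config module to use on this OS."""
    system = system or platform.system()
    if system == "Darwin":
        # Mac: fall back to IPP while HTEX hangs there.
        return "parsl.configs.local_ipp"
    return "parsl.configs.htex_local"


def load_platform_config():
    """Import the chosen config and load it into Parsl."""
    import parsl  # imported lazily so the selection logic works without parsl

    module = importlib.import_module(pick_config_module())
    parsl.load(module.config)
    return module.config
```

Calling load_platform_config() in place of parsl.load(htex_config) in the MWE should then pick IPP on a Mac laptop and HTEX on a Linux machine.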

@annawoodard removed their assignment on Apr 25, 2019

@yadudoc modified the milestones: Parsl-0.8.0, Parsl-0.9.0 on May 22, 2019

@annawoodard changed the title from "HTEX hangs for parsl.configs.htex_local.config and config from docs" to "HTEX broken on Mac OS X" on May 22, 2019

@danielskatz

Collaborator

commented Jun 5, 2019

I can confirm that this problem also makes the last cell in the default tutorial hang on my Mac.

@danielskatz added the tutorial label on Jun 5, 2019
