Skip to content

fix gpu thor #719

Merged
johnnynunez merged 4 commits into
rbonghi:masterfrom
johnnynunez:master
Oct 7, 2025
Merged

fix gpu thor #719
johnnynunez merged 4 commits into
rbonghi:masterfrom
johnnynunez:master

Conversation

@johnnynunez
Copy link
Copy Markdown
Collaborator

@johnnynunez johnnynunez commented Oct 7, 2025

Summary by Sourcery

Provide a robust mechanism for locating and installing service and environment variable templates, introduce a new install-service CLI flag, refine GPU detection logic, and update service-related error messages to use the new flag.

New Features:

  • Add an --install-service option to install and enable the jtop systemd service and its environment variables.
  • Discover variable and service template files using package metadata, virtual environment roots, sysconfig, and site user base paths.

Enhancements:

  • Refactor install_variables and install_service to use new template resolution helpers and raise JtopException when templates are missing.
  • Improve GPU detection by normalizing device names, filtering out non-device entries, and supporting a broader set of GPU identifiers.
  • Track NVML device count in GPUService for more accurate monitoring.
  • Update client startup errors to recommend running 'sudo jtop --install-service' instead of systemctl commands.

@sourcery-ai
Copy link
Copy Markdown
Contributor

sourcery-ai Bot commented Oct 7, 2025

Reviewer's Guide

This PR refactors the installation of environment variables and systemd services to locate templates dynamically (from package sources, distribution metadata, or shared dirs), enhances GPU detection logic to normalize and filter device names, adds a top-level "--install-service" CLI command for full setup, and updates service-related error messages to point to the new command.

Sequence diagram for the new --install-service CLI workflow

sequenceDiagram
    actor User
    participant CLI as "jtop CLI"
    participant OS as "Operating System"
    participant Variables as "install_variables()"
    participant ServicePerm as "set_service_permission()"
    participant Service as "install_service()"

    User->>CLI: Run "jtop --install-service"
    CLI->>OS: Check superuser privileges
    alt Not superuser
        CLI->>User: Print "Install requires superuser privileges"
        CLI->>OS: Exit
    else Is superuser
        CLI->>Variables: install_variables(package_root, copy)
        CLI->>ServicePerm: set_service_permission()
        CLI->>Service: install_service(package_root, copy)
        CLI->>User: Print "jtop.service installed and enabled"
        CLI->>OS: Exit
    end
Loading

File-Level Changes

Change Details Files
Dynamic template resolution and refactored install_variables
  • Imported Path, metadata and JtopException
  • Implemented _resolve_distribution_path to search package and distribution
  • Added _variables_template_path to aggregate search paths
  • Updated install_variables to use template lookup and raise on missing file
jtop/core/jetson_variables.py
Dynamic template resolution and refactored install_service
  • Imported Path, metadata and JtopException
  • Implemented _resolve_distribution_path to search package and distribution
  • Added _service_template_path to aggregate shared paths
  • Updated install_service to use template lookup and raise on missing file
jtop/service.py
Enhanced GPU detection and NVML tracking
  • Introduced KNOWN_GPU_DEVICE_NAMES set and normalized device names
  • Skip non-device entries and missing of_node name files
  • Preserved railgate/3d_scaling detection
  • Added _nvml_device_count field when NVML initializes
jtop/core/gpu.py
New CLI --install-service flow
  • Added --install-service argument to main parser
  • Enforced superuser check and developer mode detection
  • Combined install_variables, set_service_permission and install_service calls
  • Handled JtopException and exit codes
jtop/__main__.py
Updated service startup error messages
  • Changed exception messages in jtop.py to recommend 'sudo jtop --install-service'
  • Replaced systemctl restart references with the new command
jtop/jtop.py

Possibly linked issues

  • #[AGX Thor Developer Kit]: The PR updates jetson-stats' GPU detection logic to include the NVIDIA Jetson AGX Thor Developer Kit, directly addressing the issue.
  • #N/A: The PR improves GPU detection for Orin and provides a new command to correctly install and activate the jtop service, directly addressing the reported service inactivity and potential underlying issues.

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

Copy link
Copy Markdown
Contributor

@sourcery-ai sourcery-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey there - I've reviewed your changes - here's some feedback:

  • You’ve introduced nearly identical template lookup logic in jetson_variables.py and service.py; consider extracting it into a shared helper to avoid duplication and ease maintenance.
  • The template path resolution scans many directories on every install; consider caching resolved paths or limiting the search scope for better performance.
  • Path handling is mixed between os.path and pathlib—migrating fully to pathlib could simplify code and improve readability.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- You’ve introduced nearly identical template lookup logic in jetson_variables.py and service.py; consider extracting it into a shared helper to avoid duplication and ease maintenance.
- The template path resolution scans many directories on every install; consider caching resolved paths or limiting the search scope for better performance.
- Path handling is mixed between os.path and pathlib—migrating fully to pathlib could simplify code and improve readability.

## Individual Comments

### Comment 1
<location> `jtop/__main__.py:102-106` </location>
<code_context>
+            print("Install requires superuser privileges. Please run:\nsudo jtop --install-service")
+            sys.exit(1)
+        logging.basicConfig(level=logging.INFO, format='[%(levelname)s] %(name)s - %(message)s')
+        package_root = os.path.dirname(os.path.dirname(os.path.realpath(__file__)))
+        is_developer = os.path.isdir(os.path.join(package_root, 'tests'))
+        copy = not is_developer
</code_context>

<issue_to_address>
**suggestion:** The logic for determining package_root may not be robust in all installation scenarios.

This approach may fail if the directory structure differs from expectations. Using importlib.resources or pkg_resources can provide a more reliable way to locate package resources.

```suggestion
        import importlib.resources
        try:
            # Try to get the package root using importlib.resources
            package_root = importlib.resources.files("jtop").parent
            package_root = str(package_root)
        except Exception:
            # Fallback to previous method if importlib.resources fails
            package_root = os.path.dirname(os.path.dirname(os.path.realpath(__file__)))
        is_developer = os.path.isdir(os.path.join(package_root, 'tests'))
        copy = not is_developer
        if not is_developer:
            package_root = None
```
</issue_to_address>

### Comment 2
<location> `jtop/core/gpu.py:166-168` </location>
<code_context>
-                    # Extract real path GPU device
-                    path = os.path.realpath(os.path.join(item_path, "device"))
-                    frq_path = os.path.realpath(item_path)
-                    igpu[name] = {'type': 'integrated', 'path': path, 'frq_path': frq_path}
-                    logger.info(f"GPU \"{name}\" status in {path}")
-                    logger.info(f"GPU \"{name}\" frq in {frq_path}")
</code_context>

<issue_to_address>
**suggestion (bug_risk):** Using the original device name as the dictionary key may cause inconsistencies.

Using unnormalized device names as keys can result in inconsistent dictionary access due to differences in case or whitespace. Use normalized_name to ensure reliable lookups.

```suggestion
                    normalized_name = name.strip().lower().replace(" ", "_")
                    igpu[normalized_name] = {'type': 'integrated', 'path': path, 'frq_path': frq_path}
                    logger.info(f"GPU \"{normalized_name}\" status in {path}")
                    logger.info(f"GPU \"{normalized_name}\" frq in {frq_path}")
```
</issue_to_address>

### Comment 3
<location> `jtop/__main__.py:98` </location>
<code_context>
def main():
    parser = argparse.ArgumentParser(
        description='jtop is system monitoring utility and runs on terminal',
        formatter_class=argparse.ArgumentDefaultsHelpFormatter)
    parser.add_argument('--force', dest='force', help=argparse.SUPPRESS, action="store_true", default=False)
    parser.add_argument('--health', dest="health", help='Status jtop and fix', action="store_true", default=False)
    parser.add_argument('--install-service', dest="install_service", help='Install and enable the jtop systemd service',
                        action="store_true", default=False)
    parser.add_argument('--error-log', dest="log", help='Generate a log for GitHub', action="store_true", default=False)
    parser.add_argument('--no-warnings', dest="no_warnings", help='Do not show warnings', action="store_true", default=False)
    parser.add_argument('--restore', dest="restore", help='Reset Jetson configuration', action="store_true", default=False)
    parser.add_argument('--loop', dest="loop", help='Automatically switch page every {sec}s'.format(sec=LOOP_SECONDS), action="store_true", default=False)
    parser.add_argument('--color-filter', dest="color_filter",
                        help='Change jtop base colors, you can use also JTOP_COLOR_FILTER=True', action="store_true", default=False)
    parser.add_argument('-r', '--refresh', dest="refresh", help='refresh interval', type=int, default='1000')
    parser.add_argument('-p', '--page', dest="page", help='Open fix page', type=int, default=1)
    parser.add_argument('-v', '--version', action='version', version='%(prog)s {version}'.format(version=get_var(VERSION_RE)))
    # Parse arguments
    args = parser.parse_args()
    # Install jtop service if requested
    if args.install_service:
        if os.geteuid() != 0:
            print("Install requires superuser privileges. Please run:\nsudo jtop --install-service")
            sys.exit(1)
        logging.basicConfig(level=logging.INFO, format='[%(levelname)s] %(name)s - %(message)s')
        package_root = os.path.dirname(os.path.dirname(os.path.realpath(__file__)))
        is_developer = os.path.isdir(os.path.join(package_root, 'tests'))
        copy = not is_developer
        if not is_developer:
            package_root = None
        try:
            from .core.jetson_variables import install_variables
            from .service import set_service_permission, install_service
            install_variables(package_root, copy=copy)
            set_service_permission()
            install_service(package_root, copy=copy)
        except JtopException as e:
            print(e)
            sys.exit(1)
        print("jtop.service installed and enabled")
        sys.exit(0)
    # Initialize signals
    # signal.signal(signal.SIGINT, exit_signal)  # Do not needed equivalent to exception KeyboardInterrupt
    signal.signal(signal.SIGTERM, exit_signal)
    # Run jtop service
    if os.getenv('JTOP_SERVICE', False):
        # Initialize logging level
        logging.basicConfig(level=logging.INFO, filemode='w', format='[%(levelname)s] %(name)s - %(message)s')
        # Run service
        try:
            # Initialize stats server
            server = JtopServer(force=args.force)
            server.loop_for_ever()
        except JtopException as e:
            print(e)
        # Close service
        sys.exit(0)
    # Initialize logging level
    logging.basicConfig()
    # Convert refresh to second
    interval = float(args.refresh / 1000.0)
    # Restore option
    if args.restore:
        try:
            with jtop(interval=interval) as jetson:
                # Write warnings
                if 'L4T' in jetson.board['hardware']:
                    warning_messages(jetson, args.no_warnings)
                # Restore configuration
                if jetson.ok():
                    for status, name in jetson.restore():
                        status = bcolors.ok() if status else bcolors.fail()
                        print(" [{status}] {name}".format(name=name, status=status))
        except JtopException as e:
            print(e)
        # Close service
        sys.exit(0)
    # Run health jtop
    if args.health:
        jtop_config()
    # Generate a log for GitHub
    if args.log:
        body = get_hardware_log()
        body += "\n\nLog from jtop {version}\n".format(version=get_var(VERSION_RE))
        with open('{cwd}/{name}'.format(cwd=os.getcwd(), name=JTOP_LOG_NAME), 'w') as writer:
            writer.write(body)
        print("LOG '{name}' generated in {path}".format(name=JTOP_LOG_NAME, path=os.getcwd()))
        sys.exit(0)
    # jtop client start
    try:
        # https://stackoverflow.com/questions/56373360/n-curses-within-python-how-to-catch-and-print-non-ascii-character
        # Commented for issues #466 #393
        # locale.setlocale(locale.LC_ALL, '')
        # Open jtop client
        with jtop(interval=interval) as jetson:
            # Call the curses wrapper
            color_filter = bool(os.getenv('JTOP_COLOR_FILTER', args.color_filter))
            # Build list pages available
            pages = [ALL]
            if jetson.gpu:
                pages += [GPU]
            pages += [CPU, MEM]
            if jetson.engine:
                pages += [ENGINE]
            if jetson.fan or jetson.jetson_clocks is not None or jetson.nvpmodel is not None:
                pages += [CTRL]
            pages += [INFO]
            curses.wrapper(JTOPGUI, jetson, pages, init_page=args.page,
                           loop=args.loop, seconds=LOOP_SECONDS, color_filter=color_filter)
            # Write warnings
            if 'L4T' in jetson.board['hardware']:
                warning_messages(jetson, args.no_warnings)
    except (KeyboardInterrupt, SystemExit):
        pass
    except JtopException as e:
        print(e)

</code_context>

<issue_to_address>
**issue (code-quality):** Extract code out into function [×2] ([`extract-method`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/extract-method/))
</issue_to_address>

### Comment 4
<location> `jtop/core/jetson_variables.py:406` </location>
<code_context>
def _variables_template_path(package_root, name):
    candidates = []
    if package_root:
        base = Path(package_root)
        candidates.append(base / 'scripts' / name)
        candidates.append(base / name)
    distribution_path = _resolve_distribution_path(
        Path('scripts') / name,
        Path('share/jetson_stats') / name,
        Path(name),
    )
    if distribution_path:
        candidates.append(distribution_path)

    shared_roots = {
        Path(prefix)
        for prefix in (
            sys.prefix,
            getattr(sys, 'base_prefix', None),
            sys.exec_prefix,
            getattr(sys, 'base_exec_prefix', None),
            os.environ.get('VIRTUAL_ENV'),
        )
        if prefix
    }
    try:
        import sysconfig
    except ImportError:
        sysconfig = None
    if sysconfig:
        data_path = sysconfig.get_paths().get('data')
        if data_path:
            shared_roots.add(Path(data_path))
    try:
        import site
    except ImportError:
        site = None
    if site:
        user_base = site.getuserbase()
        if user_base:
            shared_roots.add(Path(user_base))
    for root in shared_roots:
        candidates.append(root / 'share' / 'jetson_stats' / name)
    for candidate in candidates:
        if candidate.exists():
            return candidate
    raise JtopException('Unable to locate {name} template file'.format(name=name))

</code_context>

<issue_to_address>
**issue (code-quality):** We've found these issues:

- Use named expression to simplify assignment and conditional [×3] ([`use-named-expression`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/use-named-expression/))
- Replace a for append loop with list extend ([`for-append-to-extend`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/for-append-to-extend/))
- Merge consecutive list appends into a single extend ([`merge-list-appends-into-extend`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/merge-list-appends-into-extend/))
</issue_to_address>

### Comment 5
<location> `jtop/service.py:128` </location>
<code_context>
def _service_template_path(package_root, name):
    candidates = []
    if package_root:
        base = Path(package_root)
        candidates.append(base / 'services' / name)
        candidates.append(base / name)
    distribution_path = _resolve_distribution_path(
        Path('services') / name,
        Path('share/jetson_stats') / name,
        Path(name),
    )
    if distribution_path:
        candidates.append(distribution_path)

    shared_roots = {
        Path(prefix)
        for prefix in (
            sys.prefix,
            getattr(sys, 'base_prefix', None),
            sys.exec_prefix,
            getattr(sys, 'base_exec_prefix', None),
            os.environ.get('VIRTUAL_ENV'),
        )
        if prefix
    }
    try:
        import sysconfig
    except ImportError:
        sysconfig = None
    if sysconfig:
        data_path = sysconfig.get_paths().get('data')
        if data_path:
            shared_roots.add(Path(data_path))
    try:
        import site
    except ImportError:
        site = None
    if site:
        user_base = site.getuserbase()
        if user_base:
            shared_roots.add(Path(user_base))
    for root in shared_roots:
        candidates.append(root / 'share' / 'jetson_stats' / name)
    for candidate in candidates:
        if candidate.exists():
            return candidate
    raise JtopException('Unable to locate {name} template file'.format(name=name))

</code_context>

<issue_to_address>
**issue (code-quality):** We've found these issues:

- Use named expression to simplify assignment and conditional [×3] ([`use-named-expression`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/use-named-expression/))
- Replace a for append loop with list extend ([`for-append-to-extend`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/for-append-to-extend/))
- Merge consecutive list appends into a single extend ([`merge-list-appends-into-extend`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/merge-list-appends-into-extend/))
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Comment thread jtop/__main__.py
Comment on lines +102 to +106
package_root = os.path.dirname(os.path.dirname(os.path.realpath(__file__)))
is_developer = os.path.isdir(os.path.join(package_root, 'tests'))
copy = not is_developer
if not is_developer:
package_root = None
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: The logic for determining package_root may not be robust in all installation scenarios.

This approach may fail if the directory structure differs from expectations. Using importlib.resources or pkg_resources can provide a more reliable way to locate package resources.

Suggested change
package_root = os.path.dirname(os.path.dirname(os.path.realpath(__file__)))
is_developer = os.path.isdir(os.path.join(package_root, 'tests'))
copy = not is_developer
if not is_developer:
package_root = None
import importlib.resources
try:
# Try to get the package root using importlib.resources
package_root = importlib.resources.files("jtop").parent
package_root = str(package_root)
except Exception:
# Fallback to previous method if importlib.resources fails
package_root = os.path.dirname(os.path.dirname(os.path.realpath(__file__)))
is_developer = os.path.isdir(os.path.join(package_root, 'tests'))
copy = not is_developer
if not is_developer:
package_root = None

Comment thread jtop/core/gpu.py
Comment on lines +166 to +168
igpu[name] = {'type': 'integrated', 'path': path, 'frq_path': frq_path}
logger.info(f"GPU \"{name}\" status in {path}")
logger.info(f"GPU \"{name}\" frq in {frq_path}")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (bug_risk): Using the original device name as the dictionary key may cause inconsistencies.

Using unnormalized device names as keys can result in inconsistent dictionary access due to differences in case or whitespace. Use normalized_name to ensure reliable lookups.

Suggested change
igpu[name] = {'type': 'integrated', 'path': path, 'frq_path': frq_path}
logger.info(f"GPU \"{name}\" status in {path}")
logger.info(f"GPU \"{name}\" frq in {frq_path}")
normalized_name = name.strip().lower().replace(" ", "_")
igpu[normalized_name] = {'type': 'integrated', 'path': path, 'frq_path': frq_path}
logger.info(f"GPU \"{normalized_name}\" status in {path}")
logger.info(f"GPU \"{normalized_name}\" frq in {frq_path}")

Comment thread jtop/__main__.py
args = parser.parse_args()
# Install jtop service if requested
if args.install_service:
if os.geteuid() != 0:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue (code-quality): Extract code out into function [×2] (extract-method)



def _variables_template_path(package_root, name):
candidates = []
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue (code-quality): We've found these issues:

Comment thread jtop/service.py


def _service_template_path(package_root, name):
candidates = []
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue (code-quality): We've found these issues:

@johnnynunez johnnynunez merged commit 030415e into rbonghi:master Oct 7, 2025
12 of 13 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant