hassio-addons/addon-node-red

Node-red add-on is crashing/restarting a few times per day #785

b3nj1 posted onGitHub

Problem/Motivation

Since about last Wednesday, my node-red add-on has been restarting itself a few times per day. I think that roughly corresponds to when I updated from 7.2.10 to 7.2.11.

I had updated a flow around this time as well, so my first assumption was that my flow was at fault. Unfortunately, that doesn't seem to be the case. I have reverted to before my flow update, but the problem still remains.

Expected behavior

No crashes/restarts. I've been using the add-on for over a year and it has been very reliable until now.

Actual behavior

I see this in the debug logs (accessed via Portainer). I can't be sure this is related, but I don't recall seeing this before.

0 info it worked if it ends with ok
1 verbose cli [
1 verbose cli   '/usr/bin/node',
1 verbose cli   '/usr/bin/npm',
1 verbose cli   'start',
1 verbose cli   '--',
1 verbose cli   '--settings',
1 verbose cli   '/etc/node-red/config.js'
1 verbose cli ]
2 info using npm@6.14.6
3 info using node@v12.18.4
4 verbose run-script [ 'prestart', 'start', 'poststart' ]
5 info lifecycle addon-node-red@~prestart: addon-node-red@
6 info lifecycle addon-node-red@~start: addon-node-red@
7 verbose lifecycle addon-node-red@~start: unsafe-perm in lifecycle true
8 verbose lifecycle addon-node-red@~start: PATH: /usr/lib/node_modules/npm/node_modules/npm-lifecycle/node-gyp-bin:/opt/node_modules/.bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
9 verbose lifecycle addon-node-red@~start: CWD: /opt
10 silly lifecycle addon-node-red@~start: Args: [
10 silly lifecycle   '-c',
10 silly lifecycle   'node $NODE_OPTIONS node_modules/node-red/red.js "--settings" "/etc/node-red/config.js"'
10 silly lifecycle ]
11 silly lifecycle addon-node-red@~start: Returned: code: 1  signal: null
12 info lifecycle addon-node-red@~start: Failed to exec start script
13 verbose stack Error: addon-node-red@ start: `node $NODE_OPTIONS node_modules/node-red/red.js "--settings" "/etc/node-red/config.js"`
13 verbose stack Exit status 1
13 verbose stack     at EventEmitter.<anonymous> (/usr/lib/node_modules/npm/node_modules/npm-lifecycle/index.js:332:16)
13 verbose stack     at EventEmitter.emit (events.js:315:20)
13 verbose stack     at ChildProcess.<anonymous> (/usr/lib/node_modules/npm/node_modules/npm-lifecycle/lib/spawn.js:55:14)
13 verbose stack     at ChildProcess.emit (events.js:315:20)
13 verbose stack     at maybeClose (internal/child_process.js:1021:16)
13 verbose stack     at Process.ChildProcess._handle.onexit (internal/child_process.js:286:5)
14 verbose pkgid addon-node-red@
15 verbose cwd /opt
16 verbose Linux 4.19.127-v8
17 verbose argv "/usr/bin/node" "/usr/bin/npm" "start" "--" "--settings" "/etc/node-red/config.js"
18 verbose node v12.18.4
19 verbose npm  v6.14.6
20 error code ELIFECYCLE
21 error errno 1
22 error addon-node-red@ start: `node $NODE_OPTIONS node_modules/node-red/red.js "--settings" "/etc/node-red/config.js"`
22 error Exit status 1
23 error Failed at the addon-node-red@ start script.
23 error This is probably not a problem with npm. There is likely additional logging output above.
24 verbose exit [ 1, true ]

Steps to reproduce

I don't know if this is related to my flows or not. I have many flows....

Proposed changes

Sorry, I do not.


I have no idea what to make of this... As in, there is not much to go from in the logs. Are you sure it is not an external factor? Like running out of memory that causes the OS to kill it?

posted by frenck over 4 years ago

@frenck , thanks for taking a look. I'm not sure of anything :)

As for memory, I'm running on a Raspberry Pi 4 with 4GB of SRAM. My system metrics currently read the following, but the system is running fine now.

Is there another log should grab or debug flag I should enable?

Core CPU Usage
3.4%
Core RAM Usage
25.2%
Supervisor CPU Usage
0.9%
Supervisor RAM Usage
28%
Used Space
30.4%

Thanks, Benjamin

posted by b3nj1 over 4 years ago

Honestly, I'm not sure what to enable or see in this case. Best if you know an exact time, and look as all logs available. Including system and supervisor.

posted by frenck over 4 years ago

It just happened, so I was able to see more info in the node-red log from the Supervisor tab. BTW, is there a way to see the full log?

This looks more interesting.


[30/Nov/2020:11:41:40 -0800] 200 76.21.78.185, 172.30.32.1(172.30.32.2) GET /icons/node-red-contrib-home-assistant-websocket/ha-events-state-changed.svg HTTP/1.1 (Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.0.1 Safari/605.1.15)
30 Nov 11:41:51 - [red] Uncaught Exception:
30 Nov 11:41:51 - TypeError: Cannot read property 'state' of null
    at ServerStateChangedNode.onHaEventsStateChanged (/opt/node_modules/node-red-contrib-home-assistant-websocket/nodes/events-state-changed/events-state-changed.js:84:45)
    at HaWebsocket.emit (events.js:327:22)
    at HaWebsocket.onClientEvents (/opt/node_modules/node-red-contrib-home-assistant-websocket/lib/ha-websocket.js:290:22)
    at Object.callback (/opt/node_modules/node-red-contrib-home-assistant-websocket/lib/ha-websocket.js:229:31)
    at Connection._handleMessage (/opt/node_modules/home-assistant-js-websocket/dist/haws.cjs:353:26)
    at WebSocket.<anonymous> (/opt/node_modules/home-assistant-js-websocket/dist/haws.cjs:191:57)
    at WebSocket.onMessage (/opt/node_modules/node-red-contrib-home-assistant-websocket/node_modules/ws/lib/event-target.js:125:16)
    at WebSocket.emit (events.js:315:20)
    at Receiver.receiverOnMessage (/opt/node_modules/node-red-contrib-home-assistant-websocket/node_modules/ws/lib/websocket.js:797:20)
    at Receiver.emit (events.js:315:20)
[30/Nov/2020:11:41:51 -0800] 101 76.21.78.185, 172.30.32.1(172.30.32.2) GET /comms HTTP/1.1 (Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.0.1 Safari/605.1.15)
npm ERR! code ELIFECYCLE
npm ERR! errno 1
npm ERR! addon-node-red@ start: `node $NODE_OPTIONS node_modules/node-red/red.js "--settings" "/etc/node-red/config.js"`
npm ERR! Exit status 1
npm ERR! 
npm ERR! Failed at the addon-node-red@ start script.
npm ERR! This is probably not a problem with npm. There is likely additional logging output above.
npm ERR! A complete log of this run can be found in:
npm ERR!     /root/.npm/_logs/2020-11-30T19_41_51_583Z-debug.log

In my event log, the only thing that happens at that time is all of my switches/sensors that are defined in node-red go offline.

Halloween Lights became unavailable
11:41:51 AM - 8 minutes ago
frontend_theme became unavailable
11:41:51 AM - 8 minutes ago
garage_door_ifttt_disable became unavailable
11:41:51 AM - 8 minutes ago
Garage Door Proxy State became unavailable
11:41:51 AM - 8 minutes ago
posted by b3nj1 over 4 years ago
posted by b3nj1 over 4 years ago

FYI - @zachowj

I'm trying this fix: https://github.com/zachowj/node-red-contrib-home-assistant-websocket/compare/dev...b3nj1:master

Will see if I get more crashes.

posted by b3nj1 over 4 years ago

FYI - I screwed up my git repo, so the above change is garbage. I've just locally hacked my Hassio instance using Portainer to check oldState validity (https://github.com/hassio-addons/addon-node-red/issues/785#issuecomment-736013011).

Unfortunately, I've gotten one reboot with that change in 24 hours. That's better then before the change, but still something is up. My log head was chopped, so I only see the last line of the backtrace. So, I don't know where the next issue is yet.

@frenck , Is there a way to get the full log instead of the tail of the log that shows in the Supervisor -> Addons -> NodeRed -> Log tab?

Thanks, Benjamin

posted by b3nj1 over 4 years ago

OK - I hit it again, and the same lines are causing a problem. I think it's the newState that is null based on what I see in the Home Assistant log.

I think my iOS Home Assistant app is injecting events incorrectly and causing the null state objects. I'll re-setup the integration to try to clean up my instance. Not sure if I'll be able to trace back further on this to figure out where these errors should be caught more gracefully.

Logger: homeassistant.components.mobile_app.notify
Source: components/mobile_app/notify.py:53 
Integration: Mobile App (documentation, issues) 
First occurred: November 27, 2020, 12:49:42 PM (404 occurrences) 
Last logged: 1:55:24 PM

Found duplicate device name bkd11

Home Assistant Log:

2020-12-01 13:59:04 ERROR (MainThread) [homeassistant] Error doing job: Future exception was never retrieved
Traceback (most recent call last):
File "/usr/local/lib/python3.8/concurrent/futures/thread.py", line 57, in run
result = self.fn(*self.args, **self.kwargs)
File "/usr/src/homeassistant/homeassistant/components/proximity/__init__.py", line 157, in check_proximity_state_change
entity_name = new_state.name
AttributeError: 'NoneType' object has no attribute 'name'

Node Red log:

1 Dec 13:59:04 - [red] Uncaught Exception:
1 Dec 13:59:04 - TypeError: Cannot read property 'state' of null
    at ServerStateChangedNode.onHaEventsStateChanged (/opt/node_modules/node-red-contrib-home-assistant-websocket/nodes/events-state-changed/events-state-changed.js:84:57)
    at HaWebsocket.emit (events.js:327:22)
    at HaWebsocket.onClientEvents (/opt/node_modules/node-red-contrib-home-assistant-websocket/lib/ha-websocket.js:290:22)
    at Object.callback (/opt/node_modules/node-red-contrib-home-assistant-websocket/lib/ha-websocket.js:229:31)
    at Connection._handleMessage (/opt/node_modules/home-assistant-js-websocket/dist/haws.cjs:353:26)
    at WebSocket.<anonymous> (/opt/node_modules/home-assistant-js-websocket/dist/haws.cjs:191:57)
    at WebSocket.onMessage (/opt/node_modules/node-red-contrib-home-assistant-websocket/node_modules/ws/lib/event-target.js:125:16)
    at WebSocket.emit (events.js:315:20)
    at Receiver.receiverOnMessage (/opt/node_modules/node-red-contrib-home-assistant-websocket/node_modules/ws/lib/websocket.js:797:20)
    at Receiver.emit (events.js:315:20)
npm ERR! code ELIFECYCLE
npm ERR! errno 1
posted by b3nj1 over 4 years ago

If I am reading the errors correctly, the app is causing the issue which then causes a knock on effect in the Node Red Palette?

There really isn't much that can be done from the addon (as it just wraps the software), maybe log an issue with the palette for handling the exception (though looking above it may already have been fixed) and with the app repo?

posted by sinclairpaul over 4 years ago

Actually, I think there are 3 bugs in this display:

  • The mobile app, that is sending data which, it should not
  • The proximity integration, should not choke on invalid data
  • The Node-RED HA WebSocket, should catch those null values instead of freaking out

But, as Paul correctly stated; this is not something we can address from the add-on.

posted by frenck over 4 years ago

I'll continue using my local fixes until the Node-RED HA WebSocket fix is released.

Agree with summary, so I'll close this.

Thanks, everyone, for the help.

  • Benjamin
posted by b3nj1 over 4 years ago

Fund this Issue

$0.00
Funded

Pull requests