

The Problem: Infinite Data Output Loop in a Pandas DataFrame

Once you have integrated a WebSocket connection to Binance into your script, another common challenge follows from that integration: how the incoming data is collected and stored in a pandas DataFrame.

When you use a WebSocket API such as Binance WebSockets, each message received by the client is generally stored as a separate element in the data collected from the connection. Because the stream never ends, the pandas DataFrame grows without bound for as long as the script runs. The result looks like an infinite data output loop and eventually exhausts memory.

Why Does It Happen?

In the Binance WebSockets API, messages arrive individually, each carrying a timestamp and the message content. When you subscribe to several streams (for example, the Bitcoin price and trading-pair volumes), each stream delivers its own set of distinct messages. Since the WebSocket connection runs indefinitely, it keeps receiving new messages from every stream, so a script that stores each one never stops accumulating data.
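The accumulation described above can be reproduced without a live connection. The following is a minimal sketch, assuming simulated messages: `fake_messages`, the stream names, and the field names are hypothetical stand-ins, not the actual Binance payload format. It contrasts a list that keeps every message with a fixed-size `collections.deque` that keeps only the most recent ones.

```python
from collections import deque

import pandas as pd


def fake_messages(n):
    """Yield n simulated messages alternating between two hypothetical streams."""
    streams = ["btcusdt@ticker", "ethusdt@ticker"]  # illustrative names only
    for i in range(n):
        yield {
            "stream": streams[i % 2],
            "event_time": 1_700_000_000_000 + i,  # made-up millisecond timestamp
            "price": 100.0 + i,
        }


unbounded = []               # keeps every message: grows for as long as the stream runs
bounded = deque(maxlen=100)  # keeps only the latest 100 messages

for msg in fake_messages(1000):
    unbounded.append(msg)
    bounded.append(msg)

df_all = pd.DataFrame(unbounded)         # one row per message ever received
df_recent = pd.DataFrame(list(bounded))  # at most 100 rows, however long the stream runs
```

With a real connection the loop never finishes, so `df_all` would keep growing while `df_recent` stays capped. Whether a rolling window is acceptable depends on how much history your calculations need.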

The Solution: Managing Infinite Data Output With Pandas

To avoid this unbounded data output and keep your script from running out of memory, you can use several strategies:

1. Use Dask

Dask is a parallel computing library that lets you scale computations on large data sets without a full-fledged cluster. Using Dask, you can break a massive amount of data into smaller pieces and process them in parallel, reducing memory use.

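```python
import dask.dataframe as dd
import numpy as np
import pandas as pd

# Create a sample DataFrame with 1000 rows of random prices,
# split into 10 partitions (a reasonable chunk size)
d = dd.from_pandas(pd.DataFrame({'price': np.random.rand(1000)}), npartitions=10)

# Evaluate the data partition by partition (100 rows at a time)
# and collect the result as a pandas DataFrame
result = d.compute()
```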

2. Use NumPy Buffers

If you work with large sets of binary data, consider using NumPy buffers to store and manipulate them more efficiently.

```python
import numpy as np
import pandas as pd

# Create an empty list to hold the data (as NumPy arrays)
data = []

# Process each chunk of data in a loop
for i in range(1000):
    # Stand-in for reading 10,000 bytes from the websocket connection:
    # 2500 int32 values occupy exactly 10,000 bytes
    chunk_bytes = np.zeros(2500, dtype=np.int32).tobytes()

    # Add the chunk to the list as a NumPy array viewing the raw bytes
    data.append(np.frombuffer(chunk_bytes, dtype=np.int32))

# Combine the buffers into one DataFrame
df = pd.DataFrame({'value': np.concatenate(data)})

# Now you can make calculations on the data set using Dask or pandas
```
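A buffer only limits memory if its size is fixed. As a minimal sketch of that idea, the hypothetical `PriceRingBuffer` class below (not part of NumPy or of any Binance client) preallocates one NumPy array and overwrites the oldest value once the array is full, so memory use stays constant regardless of how many messages arrive.

```python
import numpy as np


class PriceRingBuffer:
    """Fixed-capacity buffer of float64 prices (hypothetical helper)."""

    def __init__(self, capacity):
        self._data = np.empty(capacity, dtype=np.float64)
        self._capacity = capacity
        self._count = 0  # total number of values ever written

    def append(self, value):
        # Overwrite the oldest slot once the buffer has wrapped around
        self._data[self._count % self._capacity] = value
        self._count += 1

    def values(self):
        """Return the stored values, oldest first."""
        if self._count <= self._capacity:
            return self._data[:self._count].copy()
        start = self._count % self._capacity
        return np.concatenate((self._data[start:], self._data[:start]))


buf = PriceRingBuffer(capacity=5)
for price in range(8):
    buf.append(float(price))

recent = buf.values()  # only the last 5 of the 8 appended values remain
```

The array returned by `values()` can be passed to `pd.DataFrame` whenever a calculation is needed, which keeps the DataFrame itself short-lived.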

3. Use a Streaming Data Processing Library

Asynchronous libraries such as "Starlette" can process incoming data as a stream, handling each message as it arrives instead of accumulating everything first.

```python
# Note: web.View, web.Application, web.json_response and web.run_app are
# aiohttp APIs, so this example imports aiohttp rather than Starlette.
import dask.dataframe as dd
import pandas as pd
from aiohttp import web


class WebsocketProcessor(web.View):
    async def post(self):
        # Get the message from the incoming request
        message = await self.request.json()

        # Process the message and store it in a DataFrame (using Dask)
        df = dd.from_pandas(pd.DataFrame({'content': [message['data']]}), npartitions=1)

        # Evaluate the Dask DataFrame into a pandas DataFrame
        result = df.compute()

        # DataFrames are not JSON-serializable, so return a list of records
        return web.json_response(result.to_dict(orient='records'))


# Start the server to handle incoming requests
app = web.Application()
app.router.add_view('/', WebsocketProcessor)

web.run_app(app, host='0.0.0.0', port=8000)
```
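The core of the streaming approach is that each message is reduced to a small result as soon as possible, so raw messages never pile up. The sketch below illustrates this with the standard library only. `fake_stream` is a hypothetical stand-in for a WebSocket connection, and the batch size and the averaging step are arbitrary choices made for illustration.

```python
import asyncio


async def fake_stream(n):
    """Simulated message source standing in for a websocket connection."""
    for i in range(n):
        await asyncio.sleep(0)  # yield control, as a real network read would
        yield {"price": float(i)}


async def process_in_batches(stream, batch_size):
    """Reduce the stream to one average price per batch."""
    averages = []
    batch = []
    async for msg in stream:
        batch.append(msg["price"])
        if len(batch) == batch_size:
            averages.append(sum(batch) / len(batch))
            batch.clear()  # raw messages are discarded after each batch
    if batch:
        # Flush the final, partially filled batch
        averages.append(sum(batch) / len(batch))
    return averages


averages = asyncio.run(process_in_batches(fake_stream(10), batch_size=4))
```

At any moment only one batch of raw prices is held in memory. The list of averages still grows over time, so a truly endless stream would also need those results written out or capped.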

Conclusion

In conclusion, the problem of unbounded data output from Binance WebSockets into a pandas DataFrame can be handled with strategies such as using Dask, using NumPy buffers for efficient storage and processing, or processing messages as a stream.