
Cron job to remove logically redundant entries in Postgres SQL

I have a requirement to delete records from a PostgreSQL table that has more than 200 million records. The table does not have a primary key.

Sample content of the table (named Bookmark) is as follows:

  systemId    filename         mindatetime              maxdatetime
  70277       monitor_1.dat    2019-04-21 08:00:00 AM   2019-04-21 03:10:00 PM
  10006       monitor_2.dat    2019-04-25 10:00:00 AM   2019-04-25 11:30:00 AM
  10006       monitor_3.dat    2019-04-28 08:00:00 AM   2019-04-28 10:00:00 AM
  10006       monitor_3.dat    2019-04-28 09:00:00 AM   2019-04-28 11:00:00 AM
  10006       monitor_3.dat    2019-04-28 07:00:00 AM   2019-04-28 04:00:00 PM
  8368        monitor_1.dat    2019-05-21 11:00:00 AM   2019-05-21 11:30:00 AM
  8368        monitor_7.dat    2019-05-21 06:00:00 AM   2019-05-21 11:00:00 AM
  8368        monitor_5.dat    2019-05-23 08:00:00 AM   2019-05-23 10:00:00 AM

The cron job should run on a given schedule to delete the records which are logically redundant.

To explain this, take the case of systemId '10006' with filename 'monitor_3.dat', which has three entries whose min and max timestamps fall on the same day.

Logically, we can delete the two entries with mindatetime 08:00:00 AM and 09:00:00 AM (and maxdatetime 10:00:00 AM and 11:00:00 AM respectively), because both intervals are covered by the remaining entry, which spans from 07:00:00 AM to 04:00:00 PM.

The job should identify all such covered entries across the entire table and delete them.

My resultant output table content in this case should be:

  systemId    filename         mindatetime              maxdatetime
  70277       monitor_1.dat    2019-04-21 08:00:00 AM   2019-04-21 03:10:00 PM
  10006       monitor_2.dat    2019-04-25 10:00:00 AM   2019-04-25 11:30:00 AM
  10006       monitor_3.dat    2019-04-28 07:00:00 AM   2019-04-28 04:00:00 PM
  8368        monitor_1.dat    2019-05-21 11:00:00 AM   2019-05-21 11:30:00 AM
  8368        monitor_7.dat    2019-05-21 06:00:00 AM   2019-05-21 11:00:00 AM
  8368        monitor_5.dat    2019-05-23 08:00:00 AM   2019-05-23 10:00:00 AM
The table is more than 20 GB on disk, so I was exploring writing a SQL procedure or job to achieve this, but I have not been able to make much progress. Any ideas or suggestions for handling this scenario?
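For reference, the redundancy rule described above can be expressed as an EXISTS test: a row is redundant when another row with the same systemId and filename covers its entire interval. The following is only a sketch, assuming mindatetime and maxdatetime are timestamp columns; the ctid comparison is a Postgres-specific tie-breaker so that two identical rows do not mark each other redundant.

```sql
-- Sketch: list rows whose [mindatetime, maxdatetime] interval is fully
-- covered by another row for the same systemId and filename.
SELECT b.*
FROM   Bookmark b
WHERE  EXISTS (
    SELECT 1
    FROM   Bookmark other
    WHERE  other.systemId    =  b.systemId
      AND  other.filename    =  b.filename
      AND  other.mindatetime <= b.mindatetime
      AND  other.maxdatetime >= b.maxdatetime
      -- Require a strictly larger interval, or use ctid as a
      -- tie-breaker for exact-duplicate rows (keeps exactly one):
      AND  (other.mindatetime <  b.mindatetime
         OR other.maxdatetime >  b.maxdatetime
         OR other.ctid        >  b.ctid)
);
```

Running this SELECT first lets you sanity-check which rows would be deleted before committing to a DELETE.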
Jun 14 '19 #1
1 Reply


Rabbit
Expert Mod 10K+
What happens if they only partially overlap? What happens if an entry is fully covered, but only by two other entries combined? What happens if two entries have identical start and end times? You need to clearly define the requirements; otherwise you're going to run into trouble down the road.

Whatever the case may be, the answer will probably be to join the table to itself to find overlapping entries. How you formulate that join will depend on what you need to happen in the scenarios above.
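A sketch of that self-join, written as a DELETE ... USING and limited to strict containment (partial overlaps and combined coverage are deliberately left alone). Since the table has no primary key, Postgres's ctid system column is used to distinguish physical rows; note that ctid values can change after VACUUM FULL or CLUSTER, so run this in a single statement rather than saving ctids for later.

```sql
-- Sketch: delete rows fully covered by another row with the same
-- systemId and filename. The final OR also prevents a row from
-- matching itself, and keeps exactly one copy of exact duplicates.
DELETE FROM Bookmark AS b
USING  Bookmark AS other
WHERE  other.systemId    =  b.systemId
  AND  other.filename    =  b.filename
  AND  other.mindatetime <= b.mindatetime
  AND  other.maxdatetime >= b.maxdatetime
  AND  (other.mindatetime <  b.mindatetime
     OR other.maxdatetime >  b.maxdatetime
     OR other.ctid        >  b.ctid);
```

At 200+ million rows, this self-join will be slow without an index on (systemId, filename, mindatetime, maxdatetime); you may also want to batch it (for example, per systemId range) and schedule the batches from cron via psql, rather than running one giant delete.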
Jun 14 '19 #2
